Preface



When, in October 1996 in Cholula (Puebla, Mexico), I took charge of organizing
the scientific program of the next Ibero-American Congress on Artificial
Intelligence (IBERAMIA 98), I bet on a couple of ideas. First, I adopted the spirit of
the Portuguese adventurers to get the Sixth Congress on a truly international
track. To attain this aim I needed to convince everybody that the
Ibero-American AI community had improved over the years and that its individual
researchers had attained a very good level. Second, I brought my colleagues beside me
so that we were able to collect enough excellent papers without destroying the
pioneering spirit of those who first inaugurated the Congress. Getting together
to find out what is in progress in the vast region in which Latin languages
(Portuguese and Spanish) are spoken, attracting others to exchange ideas with us,
and by doing this advancing AI in general, is a risky undertaking. This book is
the result, and it sets a new standard to be discussed by all of us.

IBERAMIA was established in 1988 (Barcelona) by three Ibero-American AI 
Associations (AEPIA from Spain, SMIA from Mexico, and APPIA from Portu- 
gal), after a first meeting in Morelia (Mexico) in 1986 of SMIA and AEPIA. The 
event was organized every two years from then on in Morelia (1990), La Habana 
(1992), Caracas (1994), and Cholula (1996), taking Portuguese and Spanish as 
official languages and with the aim to promote and diffuse the research and 
development carried out in the countries associated with those two Latin lan- 
guages and connected by strong historical links from the 16th century. Over the 
years, the Executive Committee of IBERAMIA was enlarged with the inclusion 
of AVINTA (Venezuela), SMC (Cuba) and SBC (Brazil). 

The IBERAMIA 98 scientific program is structured along two main mod- 
ules, the open discussion and the paper track. The first day of the conference 
(Tuesday, October 6, 1998) is organized with tutorials directed to informatics 
professionals, the formal opening, the IBERAMIA lecture delivered by a distin- 
guished Ibero-American researcher, and the declaration of the Jose Negrete prize 
awarded by the Scientific Committee to the best paper submitted. The open dis- 
cussion track (Wednesday, October 7) is composed of working sessions devoted 
to the most important areas of research in Ibero-American countries, the AI Ed- 
ucation Symposium dedicated to confronting ideas about the best ways to teach 
AI, a session presenting the best M.Sc. and Ph.D. theses of the whole region, 
and a video conference panel to establish bridges between Europe and America 
(involving those unable to attend this panel). The paper track (Thursday and 
Friday, October 8-9) is composed of invited talks and paper presentations from 
all over the world on the full range of AI research and covering both theoretical 
and foundational issues, and applications as well. 

We received more than 150 technical papers from 21 countries, spread over
14 areas (see below). Of the 149 accepted for reviewing, 30 were written
in Spanish or Portuguese, and these were the only candidates for the open discussion
track. All were rigorously reviewed by the program committee, and only 32 were
accepted as full papers to be published by Springer (the paper track). This
high rejection rate reflects the care and thought that the program
committee and area chairs put into the review and selection process in order to
set a standard of quality for the first publication of IBERAMIA proceedings
by Springer.



Countries      Submitted   Paper Track
Argentina           2           -
Austria             1           1
Brazil             14           4
Canada              1           -
Colombia            2           -
Cuba                1           -
Ecuador             1           -
France             12           2
Germany             1           -
Italy               1           -
Japan               2           1
Mexico             26           7
Netherlands         1           -
Paraguay            1           -
Portugal           16           3
Singapore           1           -
Spain              45          10
Tunisia             1           -
UK                  9           2
USA                 2           -
Venezuela           9           2
Total             149          32

Areas                          Submitted   Paper Track
Case-Based Reasoning                5           -
Constraint Programming              5           1
Distributed AI                     22           5
Genetic Algorithms                 15           5
Intelligent Tutoring Systems        9           1
Knowledge Engineering               5           2
Knowledge Representation           12           -
Machine Learning                   17           6
Natural Language Processing        18           1
Neural Nets                         9           1
Planning                            3           1
Reasoning                          13           5
Robotics                            6           -
Vision                             10           4
Total                             149          32







We have included five invited lectures by Hector Geffner, James Allen, Cristiano
Castelfranchi, Aaron Sloman, and Ricardo Baeza-Yates to ensure an adequate
interaction between different fields, and also to broaden the
thematic spectrum of the whole conference.

The Conference is accompanied by two workshops, the Second Ibero-Ameri- 
can DAI and Multiagent Systems Workshop, to be held in Toledo, on October 
2-3, 1998, and the First Ibero-American Causal Networks (From Inference to 
Data Mining), to be held in Lisbon, on October 3, 1998, and also by a set of 
four tutorials: Agent Programming by Jose M. Ramirez (Venezuela), Mining 
the World Wide Web by Tom Mitchell (USA), Intelligent Information Retrieval 
by Ricardo Baeza-Yates (Chile), and Arquitecturas Multiagentes y sus Aplicaciones,
by Fernando de Arriaga (Spain), Ana Lilia Laureano Cruces (Mexico),
and Mohamed El Alami (Spain). In the week before, September 28 - October 4, 
the AI Portuguese association (APPIA) organizes the Sixth International Sum- 
mer School (EAIA-98) dedicated to Knowledge Discovery in Databases and Data 
Mining: Methods and Applications. 

We would like to thank the following institutions that contributed (financially
or otherwise) to the organization of this conference and to the editing
of the proceedings: Fundacao Calouste Gulbenkian, ESPRIT, IBM Portugal,
Fundacao para a Ciencia e a Tecnologia, Caixa Geral de Depositos, Agencia Abreu,
Compulog-Net, Uniao Latina, Eival, British Council, RDP-Radio Difusao
Portuguesa, SA, Camara Municipal de Lisboa, FLAD-Fundacao Luso-Americana
para o Desenvolvimento, EDP-Electricidade de Portugal, Edicoes Colibri,
Editorial Verbo and Esoterica.

Particular thanks are due to all those who helped us with the local
organization, namely Gabriel Lopes, Antonio Ribeiro, Berilhes Garcia, Gael Dias,
Irene Rodrigues, Joao Balsa, Joaquim Ferreira da Silva, Nuno Marques, Paulo
Quaresma, Sergio Freitas, and Vitor Rocio. The final thanks go to
Springer-Verlag for their help and assistance in producing this book.
Modelling Intelligent Behaviour:

The Markov Decision Process Approach 



Hector Geffner 

Depto de Computacion 
Universidad Simon Bolivar 
Aptdo. 89000, Caracas, Venezuela 
http://www.ldc.usb.ve/~hector



Abstract. The problem of selecting actions in environments that are
dynamic and not completely predictable or observable is a central problem
in intelligent behavior. From an AI point of view, the problem is
to design a mechanism that can select the best actions given information
provided by sensors and a suitable model of the actions and goals.
We call this the problem of Planning as it is a direct generalization of
the problem considered in Planning research where feedback is absent
and the effect of actions is assumed to be predictable. In this paper we
present an approach to Planning that combines ideas and methods from
Operations Research and Artificial Intelligence. Basically, Planning problems
are described in high-level action languages that are compiled into
general mathematical models of sequential decisions known as Markov
Decision Processes or Partially Observable Markov Decision Processes,
which are then solved by suitable Heuristic Search Algorithms. The results
are controllers that map sequences of observations into actions, and
which, under certain conditions, can be shown to be optimal. We show
how this approach applies to a number of concrete problems and discuss
its relation to work in Reinforcement Learning.



1 Introduction 

The problem of selecting actions in environments that are dynamic and not
completely predictable is a central problem in AI. Given a model of the actions
and goals, the problem is to produce a controller that can map sequences of
observations into suitable actions [27]. We call this the problem of Planning as
it is a general version of the problem traditionally considered in Planning research
where feedback is absent and actions are assumed to be deterministic (e.g., [23,
27]). In this paper we will be concerned with this problem and show how it can
be addressed by a suitable combination of models, languages and algorithms.
Models will allow us to understand the problem, while languages and algorithms
will allow us to represent and solve specific problem instances.

A simple planning problem is Levesque’s Omelette Problem [20], which involves
an agent that has a large supply of eggs and whose goal is to get three good
eggs and no bad ones into one of two bowls. The eggs can be either good or bad,
and at any time the agent can find out whether a bowl contains a bad egg by
inspecting the bowl. A sensible plan for this problem is to follow a loop in which
an egg is grabbed from the pile and broken into a bowl. If the egg is good, it’s
passed to the other bowl, else it is discarded. The loop is continued until three
eggs have been passed.

The Planning problem is how to model problems of this type and how to
obtain the corresponding plans from a suitable description of the actions and
goals. As this example illustrates, the form of plans is intimately related to the
presence or absence of feedback. In general, the sequence of actions depends on
the observations. We refer to planning in the presence of observations as closed-loop
planning, and planning in the absence of observations as open-loop planning
[24, 13, 1]. Classical planning is open-loop planning, while contingent and reactive
planning are forms of closed-loop planning. While open-loop planning has
been the focus of most research in AI Planning, closed-loop planning is normally
regarded as superior as it is more robust. Closed-loop plans can recover from
perturbations (e.g., a block falling off the gripper) and errors in the initial conditions
or action models (e.g., assuming that actions are deterministic when they
are not), while open-loop plans cannot. Closed-loop planning, however, requires
more sophisticated models, languages and algorithms. Models have to make
precise what closed-loop plans are, languages have to make room for both physical
and information-gathering actions, and algorithms have to produce functions
mapping observation sequences into actions.

In this paper we present an approach to closed-loop planning that is based
on these three elements. First we review SEARCH, MDP, and POMDP models for
planning, then we consider a suitable combination of heuristic search and dynamic
programming algorithms for solving these models, and finally, we consider
high-level action languages for representing MDPs and POMDPs in a convenient
way. We have actually implemented a shell that supports this approach and,
given a high-level representation of actions and goals, produces the appropriate
controllers [16]. We report empirical results for a number of problems, and
discuss how this approach relates to current ideas in Reinforcement Learning
[31].

2 Models 

Three standard mathematical models of sequential decisions allow us to make
precise what open and closed-loop plans are, and how they can be derived from
suitable descriptions of actions and goals. They are SEARCH models, Markov
Decision Processes or MDPs, and Partially Observable Markov Decision Processes
or POMDPs.

2.1 Search Models 

SEARCH models [22] are the most basic action models in AI and are
characterized by three assumptions: the initial state is completely known, actions are
deterministic, and their effects are not observable. Formally, a SEARCH problem
comprises

— a state space S

— an initial state s_0 ∈ S

— actions A(s) ⊆ A applicable in each state s ∈ S

— a transition function f(s, a) for s ∈ S and a ∈ A(s)

— action costs c(a, s) > 0

— a set G ⊆ S of goal states

A solution of a SEARCH problem is a sequence of actions a_0, a_1, ..., a_n that
generates a state trajectory s_0, s_1 = f(s_0, a_0), ..., s_{n+1} = f(s_n, a_n) such that each
action a_i is applicable in s_i and s_{n+1} is a goal state, i.e., a_i ∈ A(s_i) and s_{n+1} ∈ G.
The solution is optimal when the total cost is minimized.

Classical planning, i.e., open-loop planning with complete knowledge of the
initial situation and deterministic actions, is a SEARCH problem where states are
represented by sets of atoms, action costs are all equal, and both the transition
function f and the sets of executable actions A(s) are defined in a high-level
language such as Strips. The computational problem in classical planning has
been approached by looking at planning as a nearly decomposable problem [33].
Recent work suggests other formulations and algorithms that may scale up better
[3, 17], and in [5] we argue that SEARCH methods can scale up well too, provided
a suitable heuristic function is obtained from the Strips representation of the
problem (see also [21]).

2.2 MDPs 

Markov Decision Processes (MDPs) [26, 2] differ from SEARCH models in two
main respects: they accommodate probabilistic actions, and they assume that
the effect of actions is fully observable. An MDP is thus given by:¹

— a state space S

— actions A(s) ⊆ A applicable in each state s ∈ S

— transition probabilities P_a(s'|s) for s ∈ S and a ∈ A(s)

— action costs c(a, s) > 0

— a set G ⊆ S of goal states

The state s_{i+1} that results from a state s_i and an action a_i is not predictable but
is observable, and hence provides feedback for selecting the next action a_{i+1}. As a
result, a solution of an MDP is not an action sequence, but a function π mapping
states s into applicable actions a ∈ A(s). Such a function is called a policy. A
policy π assigns a probability to every state trajectory s_0, s_1, s_2, ... starting in a
state s_0, which is given by the product of all transition probabilities P_{a_i}(s_{i+1}|s_i)
where a_i = π(s_i). We assume that actions in goal states have no costs and no
effects (i.e., c(a, s) = 0 and P_a(s|s) = 1 if s ∈ G). The expected cost associated
with a policy π starting in state s_0 is the weighted average of the probability
of such trajectories times their cost Σ_{i≥0} c(π(s_i), s_i). An optimal solution is a
control policy π* that has a minimum expected cost for all states s_0 ∈ S.

¹ We are considering a subclass of MDPs, the so-called stochastic shortest-path MDPs
[2]. For general treatments, see [2] and [26]. For uses of MDPs in Planning and AI,
see [30, 7, 1, 27].

While Classical Planning can be formulated as a SEARCH problem, Closed-loop
Planning with complete information can be formulated as an MDP [11, 1].
The desired closed-loop plans are the optimal policies π*.

2.3 POMDPs 

POMDPs generalize MDPs by allowing the state to be partially observable [28, 8].
Information about the state comes from observations o whose probabilities P_a(o|s)
depend on the action a performed and the unobserved but true resulting state
s. In addition, a prior probability distribution over the states encodes the prior
belief about the initial state of the environment. A POMDP is thus characterized
by:

— states s ∈ S

— actions A(s) ⊆ A applicable in each state s

— costs c(a, s) > 0 of performing action a in s

— transition probabilities P_a(s'|s) for s ∈ S and a ∈ A(s)

— initial belief state b_0

— final belief states b_F

— observations o after action a with probabilities P_a(o|s)

Since feedback from the environment is only partial, the solution of a POMDP
is not a function mapping states into actions, but a function mapping belief
states into actions, where belief states b are probability distributions over the
real states s of the environment. The effect of the actions on belief states is
completely predictable. Indeed, the belief state b_a that results from performing
action a in the belief state b is:

    b_a(s) = \sum_{s' \in S} P_a(s \mid s')\, b(s')          (1)

while the belief state b_a^o that results from performing action a in b and then
observing o is

    b_a^o(s) = P_a(o \mid s)\, b_a(s) / b_a(o)               (2)

where b_a(o) is the probability of observing o after doing a in b, given by

    b_a(o) = \sum_{s \in S} P_a(o \mid s)\, b_a(s)           (3)

Actions a thus transform a belief state b into a new belief b_a^o with probability
b_a(o). The planning task is to go from the initial belief state b_0 to a final belief
state b_F at a minimum expected cost. This is nothing else but an MDP over belief
space, where the real states s of the environment have been replaced by belief
states b. The ‘goal’ belief states b_F can be defined in various ways: as the belief
states in which we are certain to have reached a real goal state (i.e., b(s) = 0 for
s ∉ G), as the belief states in which we are pretty sure to have reached a goal
state (i.e., Σ_{s∈G} b(s) > 1 − ε), etc.
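
As an illustration of Equations 1-3, the belief update can be sketched in Python as follows. This is a minimal sketch and not the implementation of [16]: the dictionary layout (`P[a][s_prev][s]` for P_a(s|s'), `O[a][s][o]` for P_a(o|s)) and the function names are assumptions introduced here, and the two-state example with a noisy `smell` action is hypothetical.

```python
def predict(b, a, P):
    """Equation 1: b_a(s) = sum over s' of P_a(s|s') * b(s')."""
    b_a = {}
    for s_prev, p_prev in b.items():
        for s, p_trans in P[a][s_prev].items():
            b_a[s] = b_a.get(s, 0.0) + p_trans * p_prev
    return b_a


def observation_probability(b_a, a, o, O):
    """Equation 3: b_a(o) = sum over s of P_a(o|s) * b_a(s)."""
    return sum(O[a][s].get(o, 0.0) * p for s, p in b_a.items())


def update(b, a, o, P, O):
    """Equation 2: b_a^o(s) = P_a(o|s) * b_a(s) / b_a(o)."""
    b_a = predict(b, a, P)
    norm = observation_probability(b_a, a, o, O)
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: O[a][s].get(o, 0.0) * p / norm for s, p in b_a.items()}


# Hypothetical example (not from the paper): an egg is good or bad with equal
# prior probability, and a noisy 'smell' action leaves the state unchanged.
P = {"smell": {"good": {"good": 1.0}, "bad": {"bad": 1.0}}}
O = {"smell": {"good": {"stinks": 0.2, "fresh": 0.8},
               "bad": {"stinks": 0.9, "fresh": 0.1}}}
b0 = {"good": 0.5, "bad": 0.5}
b1 = update(b0, "smell", "stinks", P, O)  # belief after smelling and observing 'stinks'
```

In this example b_a(o) = 0.5 · 0.2 + 0.5 · 0.9 = 0.55, so the updated belief assigns probability 0.45/0.55 to the state `bad`.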

Closed-loop Planning with Incomplete Information is a POMDP problem whose 
solutions are the closed-loop policies mapping belief states into actions. In the AI 
literature, such policies are often represented by contingent plans, i.e., sequential 
plans extended with tests and branches [14, 10, 20]. 

3 Algorithms 

We have seen that Planning problems can be formulated as either SEARCH, MDP,
or POMDP problems according to the type of feedback (observations) available to
the planner at run time. Techniques for solving SEARCH problems are reviewed in
detail in most AI texts, e.g., [27], and include algorithms such as A* and greedy
search, all based on a heuristic function h(s) that estimates the cost from any
state s to a goal state. A* is guaranteed to find the optimal action sequence
when the heuristic function is admissible (i.e., does not overestimate the true
cost to the goal), yet it may take exponential space. Greedy search, on the other
hand, takes constant space but is not guaranteed to find optimal solutions, or
any solutions at all. A heuristic search algorithm for classical planning problems
was shown in [6] to be competitive with state-of-the-art planning algorithms such
as GRAPHPLAN and SATPLAN. The algorithm is basically a version of A*
with a limited buffer size, and uses a heuristic function extracted from the
Strips encoding of the problem to guide the search.
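
A minimal Python sketch of A* over the SEARCH model of Section 2.1 is given below. It is a textbook formulation with an unbounded open list, not the limited-buffer variant of [6]. The function and parameter names, and the toy problem (walking from 0 to 10 with a `step` of cost 1 and a `jump` of cost 2), are assumptions made for illustration.

```python
import heapq
from itertools import count


def a_star(s0, actions, f, cost, is_goal, h):
    """Return (plan, total_cost) for a SEARCH problem, or None if no plan exists."""
    tie = count()  # tie-breaker so the heap never has to compare states or plans
    frontier = [(h(s0), next(tie), 0, s0, [])]
    best_g = {s0: 0}  # cheapest known cost to reach each state
    while frontier:
        _, _, g, s, plan = heapq.heappop(frontier)
        if is_goal(s):
            return plan, g
        if g > best_g.get(s, float("inf")):
            continue  # stale entry: a cheaper path to s was found later
        for a in actions(s):
            s2 = f(s, a)
            g2 = g + cost(a, s)
            if g2 < best_g.get(s2, float("inf")):
                best_g[s2] = g2
                heapq.heappush(frontier, (g2 + h(s2), next(tie), g2, s2, plan + [a]))
    return None


# Hypothetical toy problem: reach position 10 from position 0.
MOVES = {"step": (1, 1), "jump": (3, 2)}  # action -> (distance, cost)
GOAL = 10

plan, total = a_star(
    s0=0,
    actions=lambda s: [a for a, (d, _) in MOVES.items() if s + d <= GOAL],
    f=lambda s, a: s + MOVES[a][0],
    cost=lambda a, s: MOVES[a][1],
    is_goal=lambda s: s == GOAL,
    h=lambda s: 2 * (GOAL - s) / 3,  # admissible: no move covers distance cheaper than 2/3 per unit
)
```

Here the cheapest plan uses three jumps and one step, for a total cost of 7.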

While extensions of heuristic search methods such as AO* [23, 25] could
be used to solve problems with probabilistic actions, the standard approach to
solving MDPs and POMDPs is by means of dynamic programming methods [26, 2].
The idea is to compute from any state s (or belief state b) the optimal expected
cost V(s) (V(b)) to reach a goal and then use these values to select the
optimal actions. These optimal expected costs V(·) obey the following fixed-point
equations in SEARCH, MDP, and POMDP models:

    SEARCH:  V(s) = \min_{a \in A(s)} [ c(a, s) + V(s_a) ]                                  (4)

    MDP:     V(s) = \min_{a \in A(s)} [ c(a, s) + \sum_{s' \in S} P_a(s' \mid s)\, V(s') ]   (5)

    POMDP:   V(b) = \min_{a \in A(b)} [ c(a, b) + \sum_{o} b_a(o)\, V(b_a^o) ]               (6)



Value iteration finds the solution to these equations by an iteration method
in which initial estimates V_0 are plugged into the right-hand side of the equations
to yield new estimates V_1 on the left-hand side, which are used again to get
estimates V_2, and so on. Under suitable conditions it can be proved that the
estimates V_i approach the optimal values when i → ∞ [26, 2]. The operation
that yields new estimates V_{i+1} in terms of old estimates V_i is called an update.
Thus, each step of value iteration involves a parallel update of the estimates
V_i(s) for all s ∈ S.

1 Evaluate each action a applicable in s as:

      Q(a, s) = c(a, s) + \sum_{s' \in S} P_a(s' \mid s)\, V(s')

  initializing V(s') to h(s') when needed
2 Apply action a with minimum Q(a, s) value, breaking ties randomly
3 Update V(s) to Q(a, s)
4 Observe resulting state s'
5 Exit if s' is a goal, else set s to s' and go to 1

Fig. 1. RTDP Loop

Problems involving up to hundreds of thousands of states can be solved by
these methods in a reasonable amount of time and memory. Yet larger problems are
not uncommon in AI, and certainly arise in POMDPs where the set of belief states
is continuous and infinite. The problem with methods such as value iteration is
that they compute the optimal values of all (belief) states. In many problems,
however, only the values of a small number of states matter. This occurs in
particular in problems in which the initial state s_0 (or b_0) is known a priori. In
such cases, most of the updates in value iteration are wasted on irrelevant states.
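
The value iteration scheme discussed above (parallel updates of Equation 5 over all states) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function signature, the stopping threshold `eps`, and the three-state chain used as an example are not from the paper.

```python
def value_iteration(states, actions, P, cost, goals, eps=1e-9):
    """Iterate Equation 5 with parallel updates until estimates change by less than eps."""
    V = {s: 0.0 for s in states}  # initial estimates V_0
    while True:
        new_V = {}
        delta = 0.0
        for s in states:
            if s in goals:
                new_V[s] = 0.0  # goal states have no cost and no effects
                continue
            new_V[s] = min(
                cost(a, s) + sum(p * V[s2] for s2, p in P(s, a).items())
                for a in actions(s)
            )
            delta = max(delta, abs(new_V[s] - V[s]))
        V = new_V  # all states are updated in parallel from the old estimates
        if delta < eps:
            return V


# Hypothetical chain 0 -> 1 -> 2 (goal): the single action 'go' costs 1 and
# advances with probability 0.5, otherwise the state does not change.
STATES = [0, 1, 2]
GOALS = {2}


def chain_actions(s):
    return ["go"]


def chain_P(s, a):
    return {s + 1: 0.5, s: 0.5}


def chain_cost(a, s):
    return 1.0


V = value_iteration(STATES, chain_actions, chain_P, chain_cost, GOALS)
```

For this chain the fixed point can be checked by hand: V(1) = 1 + 0.5 V(1) gives V(1) = 2, and V(0) = 1 + 0.5 V(0) + 0.5 V(1) gives V(0) = 4. Note that every sweep touches every state, which is the cost that motivates the methods described next.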

The idea of the so-called Real Time Dynamic Programming (RTDP) methods
[1] is to allow the solution of much larger problems by focusing the updates on
the states that are likely to be relevant. This is achieved by using the current
estimates V(s) (or V(b) in POMDPs) to guide a greedy search while limiting the
updates to the states that are visited. Interestingly, repeated trials of this greedy
search algorithm with updates eventually deliver an optimal policy even if a
large fraction of the states is never or seldom visited [18, 1, 2]. For this reason
RTDP algorithms can scale up to much larger problems provided good initial
estimates are available.

The RTDP algorithm for SEARCH and MDPs is shown in Fig. 1. Note that RTDP
is basically a greedy or hill-climbing algorithm that from any state s searches
for a goal state using estimates V(s) of the expected cost to reach the goal. The
main difference with standard hill-climbing is that these estimates are updated
dynamically. Initially V(s) is set to h(s), where h is a suitable heuristic function,
and every time an action a from s is taken, V(s) is updated to make it consistent
with the estimates of its successor states (Step 3 in Fig. 1, Equation 5). The
updates guarantee that in any single trial the algorithm will eventually find the
goal, and that after repeated trials, the cost estimates will eventually converge
to their optimal values, provided that the heuristic function h used to initialize
the estimates is admissible [1, 18].
In the implementation, the estimates V(s) are stored in a hash table that
initially contains an estimate V(s_0) for s_0 only. Then when the value of a
state s' that is not in the table is needed, a new entry with V(s') set to h(s') is
created.
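
The loop of Fig. 1, together with the hash table just described, can be sketched in Python as follows. This is a minimal sketch, not the shell of [16]: a plain dictionary plays the role of the hash table, the environment is simulated by sampling from the transition probabilities, and the function name, its parameters and the three-state chain are assumptions made for illustration.

```python
import random


def rtdp_trial(s0, actions, P, cost, is_goal, h, V, rng):
    """Run one trial of the loop in Fig. 1, updating the table V in place."""
    s = s0
    while not is_goal(s):
        # Step 1: evaluate each applicable action; missing entries V(s') start at h(s').
        q = {}
        for a in actions(s):
            q[a] = cost(a, s) + sum(
                p * V.setdefault(s2, h(s2)) for s2, p in P(s, a).items()
            )
        # Step 2: select an action with minimum Q-value, breaking ties randomly.
        best = min(q.values())
        a = rng.choice([x for x in q if q[x] == best])
        # Step 3: update V(s) to Q(a, s).
        V[s] = best
        # Steps 4-5: observe the resulting state (sampled here) and continue from it.
        successors = P(s, a)
        s = rng.choices(list(successors), weights=list(successors.values()))[0]
    return V


# Hypothetical chain 0 -> 1 -> 2 (goal): 'go' costs 1 and advances with probability 0.5.
def chain_actions(s):
    return ["go"]


def chain_P(s, a):
    return {s + 1: 0.5, s: 0.5}


V = {}                   # the hash table of estimates, filled on demand
rng = random.Random(0)   # fixed seed so that runs are repeatable
for _ in range(200):     # repeated trials from the same initial state
    rtdp_trial(0, chain_actions, chain_P, lambda a, s: 1.0,
               lambda s: s == 2, lambda s: 0.0, V, rng)
```

With the admissible heuristic h = 0, the estimate V[0] approaches the optimal expected cost of 4 for this chain, and only states actually encountered during the trials ever enter the table.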

RTDP is an ‘anytime’ algorithm for solving MDPs in the sense that after any
number of steps and trials, the hash table along with the heuristic function
determines a greedy control policy that eventually converges to the optimal
policy when h is admissible. Since POMDPs correspond to MDPs over belief states,
RTDP can be used to solve POMDPs as well by simply substituting the current
state s in the RTDP loop in Fig. 1 by the current belief state b, and the observed
state s' by the new belief state b_a^o (Equations 1-3), where o is the observation
obtained. Problems of open-loop planning can be solved in a similar way, as
they are a special case of problems of closed-loop planning with incomplete
information in which there are no observations, and hence b_a^o = b_a. This includes,
for example, the problems considered by the probabilistic planner BURIDAN [19].



4 Representation 



SEARCH, MDP and POMDP models are useful for analysis but not for modeling.
In AI it has been a standard practice to model planning problems by means of
high-level languages such as Strips [15]. In recent years similar languages have
been defined for modeling probabilistic actions [19, 12] and general POMDPs [16].
We illustrate the latter with a problem of planning with incomplete information
due to Levesque [20]. It involves an agent that has a large supply of eggs and
whose goal is to get three good eggs and no bad ones into one of two bowls. The
eggs can be either good or bad, and at any time the agent can find out whether a
bowl contains a bad egg by inspecting the bowl. In [16] this problem is encoded
by expressions such as the ones in Fig. 2, which are automatically compiled into
a POMDP, whose solution obtained by the RTDP algorithm provides the desired
plan.

Such a language extends Strips in several ways: states are not associated
with sets of atoms but with assignments to arbitrary fluents; probabilities, costs
and primitive operations like ‘+’ are included; and a special predicate obs is
used to indicate observability. The fluents in this problem are the number of
good eggs and bad eggs in each bowl (ngood(a), ngood(b), nbad(a), nbad(b)),
and the boolean variables holding and good? that represent whether the agent
is currently holding an egg and whether such an egg is good. The fluent holding
is always observable, but the value of the expression nbad(bowl) > 0 is only
observable after doing the action inspect. For the formal syntax and semantics
of this language, see [16].
Action:  grab-egg()
Precond: ¬holding
Effects: holding := true
         good? := (true 0.5 ; false 0.5)

Action:  break-egg(bowl : BOWL)
Precond: holding ∧ (ngood(bowl) + nbad(bowl)) < 4
Effects: holding := false
         good? → ngood(bowl) := ngood(bowl) + 1
         ¬good? → nbad(bowl) := nbad(bowl) + 1

Action:  pour(b1 : BOWL, b2 : BOWL)
Precond: (b1 ≠ b2) ∧ ¬holding ∧
         ngood(b1) + nbad(b1) + ngood(b2) + nbad(b2) < 4
Effects: ngood(b1) := 0 , nbad(b1) := 0
         ngood(b2) := ngood(b2) + ngood(b1)
         nbad(b2) := nbad(b2) + nbad(b1)

Action:  clean(bowl : BOWL)
Precond: ¬holding
Effects: ngood(bowl) := 0 , nbad(bowl) := 0

Action:  inspect(bowl : BOWL)
Effect:  obs(nbad(bowl) > 0)

Fig. 2. High-level encoding of Omelette Problem
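
To make the encoding of Fig. 2 concrete, the hand-written loop described in the Introduction can be played against these action effects in a few lines of Python. The function name, the dictionary-based state and the `next_egg_is_good` callback are assumptions made for this sketch; it simulates the obvious plan only and is not the compiled POMDP of [16].

```python
import random


def handcrafted_omelette_plan(next_egg_is_good, target=3):
    """Simulate the obvious plan; return (number of actions, ngood, nbad)."""
    ngood = {"a": 0, "b": 0}
    nbad = {"a": 0, "b": 0}
    n_actions = 0
    while ngood["b"] < target:
        good = next_egg_is_good()      # grab-egg(): good? is true with some probability
        n_actions += 1
        if good:                       # break-egg(a): the egg lands in bowl a
            ngood["a"] += 1
        else:
            nbad["a"] += 1
        n_actions += 1
        saw_bad = nbad["a"] > 0        # inspect(a): observe nbad(a) > 0
        n_actions += 1
        if saw_bad:                    # clean(a): discard the bad egg
            ngood["a"], nbad["a"] = 0, 0
        else:                          # pour(a, b): pass the good egg to bowl b
            ngood["b"] += ngood["a"]
            nbad["b"] += nbad["a"]
            ngood["a"], nbad["a"] = 0, 0
        n_actions += 1
    return n_actions, ngood, nbad


# Example run with eggs that are good with probability 0.5, as in grab-egg().
rng = random.Random(1)
n_actions, ngood, nbad = handcrafted_omelette_plan(lambda: rng.random() < 0.5)
```

Under this simulation each egg costs four actions, and with independent eggs that are good with probability 0.5 six eggs are needed on average to obtain three good ones, so the plan takes 24 actions in expectation.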



5 Results 

5.1 Classical Planning 



Figures 3 and 4 show results comparing RTDP with two recent and powerful
planners, GRAPHPLAN [3] and SATPLAN [17], over the suite of problems in [17].
These tables are from [6], except that the RTDP column shows the result of
the algorithm with no lookahead.² The heuristic used is very informative but is
not admissible [6]. For this reason the algorithm does not improve much after
successive trials, and hence, only a single trial is considered.

As can be seen from the tables, RTDP reaches the goal very fast (Fig. 3), but
the length of plans is sometimes far from optimal (Fig. 4). One way to decrease
the average length of plans is thus to run the algorithm several times, keeping
only the best run. Other methods are the addition of ‘noise’ in the selection of
actions (see [6]), increased lookahead, and variations on the heuristic function. In
principle RTDP can produce optimal plans, but it cannot guarantee the optimality
of the plans produced, as planners such as GRAPHPLAN do.

² The algorithm in [6] is referred to as ASP, for Action Selection for Planning, and is
presented as a variation of Korf’s LRTA* [18], which is the deterministic version of
RTDP.
Problem        GRAPHPLAN   SATPLAN   RTDP
rocket_ext.a         268      0.17    1.3
logistics.a        5,942        22      7
logistics.b        2,538         6      6
12 blocks          1,119        18      1
15 blocks              —       524      5
19 blocks              —     4,220     19

Fig. 3. Time performance in seconds of RTDP in comparison with GRAPHPLAN and
SATPLAN



Problem        GRAPHPLAN   SATPLAN   RTDP
rocket_ext.a          34        34   35/28
logistics.a           54        54   64/57
logistics.b           47        47   58/48
12 blocks              9         9   16/12
15 blocks              —        14   24/19
19 blocks              —        18   32/25

Fig. 4. Quality performance. Averages and minimal plan lengths over 25 runs shown



5.2 Planning with Incomplete Information 

Figure 5 displays the performance curve for the Omelette problem discussed
above. The flat curve shows the average number of actions needed to solve the
problem with the obvious plan in which an egg is grabbed and broken into
one of the two bowls, and after inspecting the bowl, it’s either passed to the
other bowl or discarded, until three eggs have been passed. The other curve
shows the performance of the greedy controller produced by RTDP after different
numbers of trials. The heuristic function used in this case is admissible and follows
from assuming that the next state will be observable [9, 16]. The convergence
takes more than 1000 trials, as the algorithm has to ‘learn’ the value of the
action ‘inspect’, which, like all information-gathering actions, appears useless to
the heuristic. The time for 2000 trials in this problem is in the order of 192
seconds on an UltraSparc running at 143 MHz.

The second problem is originally from [10], where it is presented as a challenging
problem for contingent planning. It deals with a robot that is inside a room
where there is a table, two boxes, a pile of red things by the door, and a key
that may be in either of the two boxes. The goal is to get one red thing outside
the room. In order to get out of the room the key must be placed by the door.
The robot cannot hold two things at the same time, and while it knows whether
it’s holding something or not, it does not know what it is holding. The resulting
POMDP for this problem involves 480 states, 21 actions, and 6 observations (see
[5] for a complete high-level description). An optimal plan for this problem is
“go to the door, pick up a red thing, leave it on the table, then go for the key in
one of the boxes and place it by the door; finally, go to the table, pick up what’s
[Figure: Omelette Problem]

Fig. 5. Comparison of RTDP policy vs. handcrafted plan for the ‘Omelette’ problem

[Figure: The Keys and Boxes Problem]

Fig. 6. Comparison of RTDP policy vs. handcrafted plan for ‘Key and Boxes’ problem



in there and go outside”. Other plans are possible but most plans are no good
as they leave the robot in a state of knowledge from which the goal can’t be
achieved. The average cost of this plan is shown by the flat curve in Fig. 6. The
other curve shows the performance of the RTDP controller as a function of the
number of trials. The average time to compute the first 60 trials is 561 seconds.
No times for the solution are reported in [10].

6 Discussion 

SEARCH, MDPs and POMDPs are key models for understanding various forms of
open and closed-loop planning. Effective planning, however, also requires suitable
languages and algorithms. From this perspective, the planning languages are a
convenient way for defining these models and revealing their structure so that
it can be used by suitably defined heuristics. We have also considered a simple
and general RTDP algorithm that can be used for open and closed-loop planning
with either complete or incomplete information, and presented some empirical
results.



MDP 
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models and RTDP algorithms are closely related to the models and algorithms 
used in Reinforcement Learning [29]. Indeed, reinforcement learning algorithms 
are currently understood as algorithms for solving MDPs by trial and error with 
no prior knowledge of cost and probabilities [32, 1]. The Q-learning algorithm in 
particular [32] is basically the model-free version of RTDP. For two recent books 
on MDPs and Reinforcement Learning, see [31] and [2]. 
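
As a rough illustration of the remark that Q-learning is basically the model-free version of RTDP, the following Python sketch shows a tabular Q-learning update written in a cost-minimising form. It is only a sketch under stated assumptions: the sampling interface `step`, the learning rate `alpha`, the exploration rate `epsilon`, and the toy chain problem are all hypothetical choices made for the example and are not taken from the text or from [32].

```python
import random
from collections import defaultdict

def q_learning(step, actions, start, is_goal, trials=500,
               alpha=0.5, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning for a cost-minimising MDP (illustrative sketch).

    step(state, action, rng) -> (cost, next_state) samples one transition;
    the learner never looks at the cost or probability model directly.
    """
    rng = random.Random(seed)
    q = defaultdict(float)  # Q(s, a), initialised to 0
    for _ in range(trials):
        s = start
        for _ in range(max_steps):
            if is_goal(s):
                break
            # epsilon-greedy: mostly pick the action with the lowest estimated cost
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = min(actions, key=lambda b: q[(s, b)])
            cost, s2 = step(s, a, rng)
            best_next = 0.0 if is_goal(s2) else min(q[(s2, b)] for b in actions)
            # sampled backup, in place of an expectation over a known model
            q[(s, a)] += alpha * (cost + best_next - q[(s, a)])
            s = s2
    return q

# Hypothetical toy problem: states 0..4 on a line, goal at 4, unit action costs.
def chain_step(s, a, rng):
    move = a if rng.random() < 0.9 else -a   # 10% chance the move is reversed
    return 1.0, max(0, min(4, s + move))

q = q_learning(chain_step, actions=[-1, 1], start=0, is_goal=lambda s: s == 4)
greedy = {s: min([-1, 1], key=lambda b: q[(s, b)]) for s in range(4)}
```

The only place where the transition model enters is the call to `step`, which returns a single sampled outcome. A model-based backup would instead average over all successor states using known probabilities.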

Acknowledgement. My work on planning and MDPs has been in collaboration 
with Blai Bonet. Partial support is due to Conicit, Grant S 1-96001365. 

1 Premise. At the frontier of a millennium: The challenge 

Will the “representational paradigm”, which has characterised Artificial 
Intelligence (AI) and Cognitive Science (CS) from their very birth, be eliminated in 
the 21st century? Will this paradigm be replaced by a new one based on dynamic 
systems, connectionism, situatedness, embodiedness, etc.? Will this be the end of the 
ambitious AI project? I do not think so. Challenges to and attacks on AI and CS have 
been hard and radical in the last 15 years; however, I believe that the next century will 
start with a renewed rush of AI, and that we will not witness a paradigmatic revolution, 
with connectionism replacing cognitivism and symbolic models; emergentist, 
dynamic and evolutionary models eliminating reasoning on explicit representations 
and planning; neuroscience (plus phenomenology) eliminating cognitive processing; 
situatedness, reactivity, cultural constructivism eliminating general concepts, 
context-independent abstractions, ideal-typical models. I claim that the major scientific 
challenge of the first part of the century will be precisely the construction of a new 
“synthetic” paradigm: a paradigm that puts together, in a principled and non-eclectic 
way, cognition and emergence, information processing and self-organisation, 
reactivity and intentionality, situatedness and planning, etc. [Cas98a]. 

AI is emerging from a crisis: a crisis of grants, of prestige, and of identity. This crisis was 
not only due, in my view, to exaggerated expectations and overselling of specific 
technologies (like expert systems) identified tout court with AI. It was also due to the 
restriction of the cultural interests and influence of the discipline, and of its ambitions; to 
the dominance either of the logicist approach (identifying logics and theory, logics 
and foundations) or of a mere technological/applicative view of AI (see the debate 
about the ‘pure reason’ [McD87] and ‘rigor mortis’). New domains were growing as 
external and antagonistic to AI: neural nets, reactive systems, evolutionary computing, 
CSCW, cognitive modelling, etc. Hard attacks were made on the “classical” AI 
approach: situatedness [Suc87], anti-symbolism, reactivity [Bro89] [Agr89], dynamic 
systems, bounded and limited resources, uncertainty, and so on (on the challenges to 
AI and CS see also [Tha96]). 



¹ This is a preliminary version. Some of the sub-sections (such as “How to (partially) reduce social 
power to individual power”, in section 4; “Delegation” and “Conflict”, in section 5) were omitted due to 
space limitations of the whole text. 



Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 13-26, 1998. 
© Springer-Verlag Berlin Heidelberg 1998 

However, by relaxing previous frameworks; by some contagion and hybridisation; by 
incorporating some of those criticisms; by re-absorbing as its own descendants neural 
nets, reactive systems, evolutionary computing, etc.; by developing important internal 
domains like machine learning and DAI-MAS; by important developments in logics 
and in languages; and finally with the new successful Agents framework, AI is now in 
a revival phase. It is trying to recover all the original challenges of the discipline, its 
strong scientific identity, its cultural role and influence. We may in fact say that there 
is already a neo-cognitivism and a new AI. In this new AI of the ’90s, systems and 
models are conceived for reasoning and acting in open unpredictable worlds, with 
limited and uncertain knowledge, in real time, with bounded (both cognitive and 
material) resources, interfering, either co-operatively or competitively, with other 
systems. The new password is interaction [Bob91]: interaction with an evolving 
environment; among several, distributed and heterogeneous artificial systems in a 
network; with human users; among humans through computers. 

1.1 The synthesis 

Synthetic theories should explain the dynamic and emergent aspects of cognition and 
symbolic computation; how cognitive processing and individual intelligence emerge 
from sub-symbolic or sub-cognitive distributed computation, and causally feed back 
into it; how collective phenomena emerge from individual action and intelligence and 
causally shape the individual mind in turn. We need a principled theory which is able to 
reconcile cognition with emergence and with reactivity: 

Reconciling "Reactivity" and "Cognition" 

We shouldn’t consider reactivity as an alternative to reasoning or to mental states 
[Cas95]. A reactive agent is not necessarily an agent without mental states and 
reasoning. Reactivity is not equal to reflexes. Cognitive and planning agents, too, are 
and must be reactive (as in several BDI models). They are reactive not only in the 
sense that they can have some hybrid and compound architecture that includes both 
deliberated actions and reflexes or other forms of low-level reactions (for example, 
[Kur97]), but also because there is some form of high-level cognitive reactivity: the agent 
reacts by changing its mind (plans, goals, intentions). Suchman's provocative 
claims against planning are also clearly too extreme and false. 

In general we have to bring all the anti-cognitivist claims, applied to sub-symbolic or 
insect-like systems, to the level of cognitive systems ². 



² Cognitive agents are agents whose actions are internally regulated by goals (goal-directed) and whose 
goals, decisions, and plans are based on beliefs. Both goals and beliefs are cognitive representations that 
can be internally generated, manipulated, and subject to inferences and reasoning. Since a cognitive 
agent may have more than one goal active in the same situation, it must have some form of 
choice/decision, based on some “reason”, i.e. on some belief and evaluation. Notice that I use “goal” as 
the general family term for all motivational representations: from desires to intentions, from objectives 
to motives, from needs to ambitions, etc. By “sub-cognitive” agents I mean agents whose behaviour is 
not regulated by an internal explicit representation of their purposes and by explicit beliefs. Sub-cognitive 
agents are for example simple neural-net agents, or mere reactive agents. 

Reconciling "Emergence" and "Cognition" 

Emergence and cognition are not incompatible: they are not two alternative 
approaches to intelligence and cooperation, two competing paradigms. They must be 
reconciled: 

- first, considering cognition itself as a level of emergence: both as an emergence 
from sub-symbolic to symbolic (symbol grounding, emergent symbolic 
computation), and as a transition from objective to subjective representation 
(awareness) (see later, for example, on dependence and on conflicts) and from 
implicit to explicit knowledge; 

- second, recognising the necessity of going beyond cognition, modelling 
emergent unaware, functional social phenomena (e.g. unaware cooperation, 
non-orchestrated problem solving, and swarm intelligence) also among cognitive and 
planning agents. In fact, for a theory of cooperation and society among intelligent 
agents, mind is not enough [Con96]. We have to explain how collective 
phenomena emerge from individual action and intelligence, and how a 
collaborative plan can be only partially represented in the minds of the 
participants, and some part represented in no mind at all [Hay67]. 

This is the most challenging problem of reconciliation between cognition and 
emergence: unaware social functions impinging on intentional actions. AI can 
significantly contribute to solving the main theoretical problem of all the social sciences 
[Hay67]: the problem of the micro-macro link, the problem of theoretically 
reconciling individual decisions and utility with the global, collective phenomena and 
interests. AI will contribute uniquely to solving this crucial problem, because it is able 
to formally model and to simulate at the same time the individual minds and 
behaviors, the emerging collective action, structure or effect, and their feedback to 
shape minds and reproduce themselves. Thus in the (formal and experimental) 
elaboration of this synthetic paradigm a major role will be played by AI, in particular 
by its agent-based and socially oriented approach to intelligence. 

2 Neo-reductionism and the micro-macro integration of 
scientific theories 

The real problem is the “integration” between different levels of description 
and/or explanation of reality; between different levels of granularity and complexity 
of systems. I claim that simple “compatibility” (no obvious contradiction) between 
principles and laws of one level and principles and laws of another level is only a 
minimal requirement. Much more is needed. We should systematically orchestrate 
one scientific layer (macro) with the other scientific layer (micro), and one kind/level of 
explanation with a deeper level of explanation. 

I adopt a “neo-reductionist” perspective (as Miguel Virasoro defines it [Vir96]). 
Neo-reductionism postulates that from the number and the interactions of the elements of a 
complex system some behaviours will emerge whose laws can and must be described 
at that superior layer. The old reductionist position claims that the emerging level has no 
autonomy of description, that you cannot formulate specific concepts and laws; you 
have just to explain it in terms of the underlying level: the only valid scientific 
position is to study the micro-units. By contrast, an anti-reductionist position will 
claim that the laws of the higher level of description (organisation) have nothing to do 
with the laws at the underlying level, and that there is no reason to search for a strong 
link between the two levels. The neo-reductionist position considers both approaches 
as necessary, by describing the laws typical of each level and investigating how from 
the basic level a complex behaviour can emerge and organise at the macro-level, and 
how it can possibly feed back into the micro-level (Virasoro does not take into account 
the feedback from macro to micro: he only considers the process of emergence, as 
typical of physics, while ignoring the process of “immergence” so relevant at the 
psychological and sociological levels). 

I claim that the integration between different description layers requires at least three 
devices. 

A) Cross-layer Theories. By 'Cross-layer Theories' I mean general models and laws 
valid at any or at least at several levels. The evolutionary framework, for example, can 
be successfully applied from molecules to species, cultures, organisations, ideas. 
Analogously, the system dynamics approach can be applied to weather, to atoms, to 
neurons, to animal populations, to markets [Wei97]. 

B) Bridge-Theories. By 'Bridge-Theories' I mean theories that explicitly connect two 
levels of explanation, i.e. theories able to explain how a high-level complex system 
works through (is implemented in) the micro-activities of its components; how 
complex phenomena emerge from simple behaviours; how the emerging global 
structure and behaviour feeds back into and shapes the behaviours of its units. 

C) Layered Ontologies and Concepts. General broad notions are needed, applicable 
at different levels, but level-specific definitions of the same notion are also needed. 
For example we cannot have two independent notions of action, or of communication, 
one for simple reactive agents (e.g. for insects), the other for intentional agents (e.g. 
for humankind). We have to characterise the general features of 'action' or of 
'communication' and at the same time to have more specific notions for sub-cognitive 
and for cognitive agents. 

I will give some examples of all these different integrating devices in modelling 
agents and MAS. 

3 Cross-layer Theories 

In AI, too, some principles are valid at different levels of granularity and 
complexity, both at the micro and at the macro level. For example general 
principles of coordination, or general principles of search, and so on. I will briefly 
illustrate only one very important structure emerging at any Multi-Agent level 
independently of the granularity and cognitive complexity of the agents: the 
objective interdependence structure. 

3.1 An emergent social structure: The dependence network 

The Dependence network [Cas92] [Con95] [Sic95] determines and predicts 
partnership and coalition formation, competition, cooperation, exchange, functional 
structure in organisations, rational and effective communication, and negotiation 
power. The dependence theory (and the related power theory - see 4.2) is a 
Cross-layer theory: it usefully applies to different levels of agenthood and contributes to 
theoretically unifying these levels. 

4 Bridge-Theories: the micro-implementation and counterparts 
of macro behaviours and entities 

Macro-level social phenomena are implemented through the (social) actions 
of the individual agents. In the case of cognitive agents, without an explicit theory of 
the agents’ minds that grounds the agents’ actions we cannot understand and explain 
several macro-level social phenomena (like teamwork and organizations), and in 
particular how they work. 

Let’s consider the individual counterparts of social norms, social power, 
organisational commitment and team-work. 

4.1 The mental counterpart and cognitive implementation of social norms 

Social norms are a multi-agent and multi-faceted social object: in order to work 
they should be represented in the minds of the involved agents, but these 
representations are not always the same: the agents play different normative roles and 
have different mental representations of the norms. Consider the addressee of the 
norm: it has to understand (believe) that there is a given expectation and prescription 
regarding its behaviour, and that it has to adopt this goal; but it has also to understand 
that this is not a personal or arbitrary request, but a ‘group will’, issued by some 
authority and in principle not aimed at personal interests. The addressee has also to 
believe that it is concerned by the norm and that a given act is an instance of the class 
of prescribed behaviours. The attitude and the normative mind of a “policeman”, i.e. 
of an agent entitled to control norm obedience, is different. And also different is the 
mind of a neutral observer or of the “legislator” issuing the norm [Con95]. In other 
words, a norm N emerges as a norm only when it emerges as a norm into the mind of 
the involved agents; not only through their minds (as in approaches based on 
imitation or behavioural conformity, e.g. [Bic90]). That is, it works as an N 
only when the agents recognise it as an N, use it as an N, “conceive” it as an N 
[Con95] [Cas98b]. Norm emergence and formation implies “cognitive emergence” 
(hence cognitive agents): a social N is really an N after its Cognitive Emergence (CE) ³. 
As long as the agents interpret the normative behaviour of the group merely as a 
statistical “norm”, and comply by imitation, the real normative character of N remains 
unacknowledged, and the efficacy of such “misunderstood N” is quite limited. Only 
when the normative (which implies “prescriptive”) character of N becomes 
acknowledged by the agent does N start to operate efficaciously as an N through the 
true normative behaviour of that agent. Thus the effective “cognitive emergence” of N 
in the agent’s mind is a precondition for its social emergence in the group, for its 
efficacy and complete functioning as an N. 

³ When the micro-units of emerging dynamic processes are cognitive agents, a very important and 
unique phenomenon arises: the Cognitive Emergence (CE) [Con95] [Cas98b]. There is “cognitive 
emergence” when agents become aware, through a given “conceptualisation”, of a certain “objective” 
pre-cognitive (unknown and non-deliberated) phenomenon that is influencing their results and outcomes, 

4.2 Grounding joint action in group-commitment, social commitment, and 
personal commitment 

Many of the theories about joint or group action try to build it up on the basis of 
individual action: by directly reducing, for example, joint intention to individual 
intentions, joint plan to individual plans, group commitment (to a given joint intention 
and plan) to individual commitments to individual tasks. In [Cas97a] I claim that in 
this attempt the intermediate level between individual and collective action is 
bypassed; the real basis of all sociality (cooperation, competition, groups, 
organization, etc.) is missed: i.e. the individual social action and mind. One cannot 
reduce or connect action at the collective level to action at the individual level without 
passing through the social character of the individual action. 

It is true that we cannot understand and explain collaboration [Gro95], cooperation 
[Tuo93] [Tuo88] [Con95], teamwork [Lev90] without explicitly modelling, among 
cognitive agents, the beliefs, the intentions, plans, commitments of the involved 
agents. However, the attempt to connect collective intentions and plans to individual 
intentions has been too direct and simplistic, in that some mediating level and object 
has been ignored: in particular social intentions and social commitments. 

How to reduce collective goals to individual goals 

In my view the most important point is that in joint activity (be it cooperation or 
exchange): 

• the agents do not have only beliefs about the intentions of the other agents, but 
they have positive expectations about the actions (and then the goals) of their partners. 
Expectations imply beliefs + goals about the actions (the goals) of the other: each 
agent delegates the partner to do part of the joint plan. So the social goal that the other 
intends to do a given action and does it, is the first basic ingredient of collaboration. 

On the other side: 

• each partner or member has to adopt (agree about) this delegation (task 
assignment), then she has the goal of doing her share not only because she shares a 
common goal and/or a common plan, but also because she adopts the expectations of 
the other members and of the group (as a whole, as a collective-agent). She adheres to 
the explicit or implicit request of the group or of the partners. 

and then, indirectly, their actions. CE is a feedback effect of the emergent phenomenon on its ground 
elements (the agents): the emergent phenomenon changes their representations in a special way: it is 
(partially) represented into their minds. The “cognitive emergence” (through experience and learning, or 
through communication) of such “objective” relations strongly changes the social situation: from known 
interference, relations of competition/aggression or exploitation can arise; from acknowledged 
dependence, relations of power, goals of influencing and asking, possible exchanges or cooperation, will 

In several approaches basically derived from Tuomela and Miller’s theory of 
we-intentions [Tuo88], the delegation goals, the expectations (not just beliefs) about 
the others’ intentions, and the G-adoption among the members are not explicit. 

Social Commitment 

Social Commitment is exactly such a relation of complementary individual social 
goals that results from the merging of a strong delegation and the corresponding 
strong adoption: reciprocal Social Commitments constitute the real structure of groups 
and organizations. 

Here, too, we need a notion of Commitment as a mediation between the individual 
and the collective one. There is a pre-social level of commitment: the Internal 
Commitment (I-Commitment) that corresponds to that defined by Cohen and 
Levesque (on the basis of Bratman’s analysis) [Coh90]. It refers to a relation between 
an agent and an action. The agent has decided to do something, the agent is 
determined to execute a certain action (at the scheduled time), the goal (intention) is a 
persistent one. 

A “social commitment” is not an individual Commitment shared by several agents. 
Social Commitment (S-Commitment) is a relational concept: the Commitment of one 
agent to another [Sin91] [Cas96]. It expresses a relation between at least two agents. 
More precisely, S-Commitment is a 4-argument relation: (S-COMM x y a z); where x 
is the committed agent; a is the action x is committed to do; y is the other agent whom 
x is committed to; z is a third agent before whom x is committed. Let us neglect the 
third agent (z), i.e. the witness. Here, I will focus on the relation between x and y. 
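
To make the argument structure of (S-COMM x y a z) concrete, here is a minimal Python sketch of the relation as a record. The class name, the field types, and the `reciprocal` helper are hypothetical illustrations and not part of the formal theory; the witness z is optional because the text sets it aside.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SCommitment:
    """(S-COMM x y a z): x is committed to y to do a, before witness z."""
    x: str                    # the committed agent
    y: str                    # the agent whom x is committed to
    a: str                    # the action x is committed to do
    z: Optional[str] = None   # the witness, neglected in the text

def reciprocal(c1: SCommitment, c2: SCommitment) -> bool:
    """Hypothetical helper: two commitments are reciprocal when the
    committed agent of each is the addressee of the other."""
    return c1.x == c2.y and c1.y == c2.x

# Illustrative agents and actions (invented for the example)
c1 = SCommitment(x="ann", y="bob", a="paint the wall")
c2 = SCommitment(x="bob", y="ann", a="pay for the paint")
```

The point of the record is that an S-Commitment cannot be written down without naming a second agent, which is what distinguishes it from an Internal Commitment between one agent and one action.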

We should also distinguish S-Commitment from Collective Commitment 
(C-Commitment) or Group Commitment. The latter is just an Internal Commitment of a 
Collective agent or Group to a collective action. In other terms, a set of agents is 
Internally Committed to a certain intention and (usually) there is mutual knowledge 
about that. It remains to be clarified what the relationships are between 
S-Commitment and C-Commitment. 

x’s I-Commitment on a is neither a necessary nor a sufficient condition for his 
S-Commitment on a. Just y’s belief that x is I-Committed to a is a necessary condition 
of x’s S-Commitment. Anyway, postulating that our agents are always “honest”, as in 
other models of Commitment, we may remark that the S-Commitment of x to y to a 
implies an I-Commitment of x to a. 

Relationships between Social and Collective Commitment 

In strictly “cooperative” groups (which in our sense are based on a Common Goal and 
Mutual Dependence), in team work, an S-Commitment of everybody to everybody 
arises: each one not only intends but has to do his own job. Given that the members 
form the group, we may say that each member is S-Committed to the group to do his 
share [Sin91]. 

So, the C-Commitment (defined as the I-Commitment of a collective agent) will 
imply (at least in the case of a fully cooperative group): 

- the S-Commitment of each member to the group: x is S-Committed not simply to 
another member y, but to the whole set/group X he belongs to; 

- the S-Commitment of each member to each other; then also many Reciprocal 
Commitments; 

- the I-Commitment of each member to do his action. 

Joint Intentions, team work, coalitions (and what I call C-Commitments) imply 
S-Commitments among the members and between the member and her group, and 
S-Commitment (usually) implies I-Commitment. This is the bridge-theory connecting 
joint activity and individual mind. Although the S-Commitment is not completely 
“reducible” to the I-Commitment, because it is an intrinsically relational/social notion 
(among agents), and contains much more than the I-Commitment of the involved 
agents, the social level construct is clearly linked to the individual level construct. 
Notice also that the I-Commitment is a cross-layer notion. 
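
The chain of implications from C-Commitment to S-Commitments to I-Commitments can be sketched as a small expansion routine. This is only an illustration under stated assumptions: the function name, the tuple encoding, and the `"GROUP"` placeholder for the collective agent are hypothetical, and the step from S-Commitment to I-Commitment relies on the "honest agents" postulate mentioned in the text.

```python
from itertools import permutations

def implied_commitments(group, share):
    """Expand the C-Commitment of a fully cooperative group.

    group: list of member names
    share: dict mapping each member to the action that is his share
    Returns (s_commitments, i_commitments) as sets of tuples.
    """
    s_commitments = set()
    for x in group:
        # each member is S-Committed to the whole group to do his share
        s_commitments.add((x, "GROUP", share[x]))
    for x, y in permutations(group, 2):
        # each member is also S-Committed to each other member
        s_commitments.add((x, y, share[x]))
    # with "honest" agents, every S-Commitment implies an I-Commitment
    i_commitments = {(x, a) for (x, _, a) in s_commitments}
    return s_commitments, i_commitments

# Illustrative three-member group (names and actions invented)
s, i = implied_commitments(
    ["x", "y", "w"], {"x": "a1", "y": "a2", "w": "a3"}
)
```

The expansion only runs downwards. Nothing in it reconstructs the S-Commitments from the I-Commitments, which mirrors the claim that the social construct is linked to, but not reducible to, the individual one.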

5 Layered Ontologies and Concepts 

Clearly we need the notions of communication, coordination, cooperation, 
social action, of conflict, deception, agenthood, etc. both for cognitive, intentional 
agents and for sub-cognitive, merely reactive agents. 

5.1 Agent 

The proliferating notion of “agent” - so crucial that it is transforming the whole of AI 
and the notion of computing itself - awaits some “systematisation”. However, my 
claim is that what is needed is not a unique definition, which will be either too generic 
or too specific to computer science and technical. I think that what is needed is a 
system and, more precisely, a hierarchy of well-orchestrated definitions. 

The notion of agent (both the one we need, and the natural concept) is a layered one: 
there are broader notions (e.g. any causal factor) and narrower notions (e.g. intentional 
actors), but these notions are in a definable conceptual relation that can/should be 
made explicit. The problem is that there is not just one simple conceptual hierarchy. 
This is a heterarchical notion: a hierarchy with several independent roots. For example, the 
dimension “delegated”/“non delegated” is very important for the notion of “agent” in 
economics and in some AI domains. Another very important hierarchy to be clarified is 
that relative to “software” agents: a piece of software; an ‘object’; an agent: what are the 
relationships? Also other notions from system and control theory seem important. 

What is an agent? Elements for a definition 

The weakest and most basic notion of “agent” is that of a causal entity able to produce 
some change, some effect on a world, in an environment. This is the very poor 
meaning of physical forces as agents, or of “atmospheric agents” and so on. This 
capability to cause effects holds (as a component of the notion of “action”) in each of 
the more specific notions of agent. An agent can cause, can do something. More than 
this, when we conceive some causal force as “agent” we do not focus on the previous 
possible causal chain; we focus on this cause as an initial causa prima, as independent 
if not autonomous. This notion of agent (relevant for example in semantics) is not 
sufficient either in AI or in Psychology. The notion starts to be interesting for 
Cognitive Science when the effects are no longer accidental or simply “efficient”: 
when the causal behaviour is finalistic (teleonomic) and becomes a true “action”. This 
is the case of Teleonomic or Goal-oriented agents. 

Mere Goal-oriented Vs Goal-governed systems 

There are two basic types of system with finalistic (teleonomic) behaviour: intentional 
(more generally: goal-governed) and functional (mere goal-oriented). 

Goal-oriented systems [McF83] are systems whose behaviour is finalistic, aimed at 
realising a given result which is not necessarily understood or explicitly represented 
(as an anticipatory representation) within the system itself. A typical sub-type of these 
systems is in fact that of Mere-Goal-oriented systems, which are rule-based (production 
rules or classifiers) or reflex-, or releaser-, or association-based: they react to a given 
circumstance with a given behaviour (and they can possibly learn and adapt). 

Goal-governed systems are anticipatory systems. I call goal-governed a system or 
behaviour that is controlled and regulated purposively by an internally represented 
goal, a “set-point” or “goal-state” (cf. [Ros68]). The simplest example is a 
boiler-thermostat system. A “goal-governed” system responds to external functions through 
its internal goals. 
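
The contrast between a mere goal-oriented system and a goal-governed one can be sketched with the boiler-thermostat example. The two classes below are hypothetical illustrations, and the 20-degree threshold is an arbitrary assumption. In the first class the result aimed at is only implicit in a fixed condition-action rule; in the second the behaviour is regulated by comparing the perceived state with an explicitly represented set-point.

```python
class ReflexHeater:
    """Mere goal-oriented: a fixed condition-action rule, with no
    internal representation of the result it serves."""

    def act(self, temperature: float) -> str:
        # the threshold is wired into the rule itself (assumed value)
        return "heat" if temperature < 20.0 else "idle"


class Thermostat:
    """Goal-governed: behaviour is regulated by an internally
    represented goal-state (the set-point)."""

    def __init__(self, set_point: float) -> None:
        self.set_point = set_point   # the represented goal-state

    def act(self, temperature: float) -> str:
        # compare the perceived state with the represented goal
        return "heat" if temperature < self.set_point else "idle"


reflex = ReflexHeater()
thermostat = Thermostat(set_point=18.0)
```

Because the set-point is a represented value, it can be changed (for example `thermostat.set_point = 22.0`) and the same regulation mechanism then pursues a different goal. The reflex rule can only be changed by rewriting the rule.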

This is the basic notion of Agent of interest to AI: exploitable goal-oriented 
processes. This substantially converges with Franklin & Graesser’s definition [Fra96]. 
It is crucial to stress that mere goal-oriented systems and goal-governed systems are 
mutually exclusive classes, but that goal-governed systems can also be goal-oriented. 
Goal-government can be incomplete. It implements and improves goal-orientation, 
but it does not (completely) replace the latter: it does not make the latter redundant 
(contrary to Elster's claim [Els82] that intentional behaviour excludes functional 
behaviour; see later). So we have causal entities, teleonomic causal entities, and, 
among the latter, mere goal-oriented and goal-governed agents, the latter including also 
intentional agents. However, this is only one hierarchy. Another very important one is 
that based on delegation: AI agents in fact frequently need to work autonomously but 
'on behalf of' the user or of some other agent. So one should distinguish between 
non-delegated and delegated agents, and between different kinds of delegation (see later) 
and different kinds of autonomy. 

I propose to call Agenthood the capability to act; and to call Agency the capability to 
act under "delegation" of another agent. In this definition Agency presupposes 
Agenthood: only an agent in the full sense (goal-oriented) can be an agency, and 
necessarily it is delegated by (acts on behalf of) another agent. 

5.2 Social Action 

A social action (SA) is an action that deals with another entity as an agent, i.e. as an active, 
autonomous, goal-oriented entity. 

For cognitive agents, a SA is an action that deals with another cognitive agent 
considered as a cognitive agent, whose behavior is regulated by beliefs and goals 
[Cas97a]. In SA the agent takes an Intentional Stance towards the other agents: i.e. a 
representation of the other agent’s mind in intentional terms [Den81]. 

An action related to another agent is not necessarily social. The opposite is also true. 
A merely practical action, not directly involving other agents, may be or become 
social. The practical action of closing a door is social when we close the door to prevent 
some agent from entering or looking inside our room; the same action performed to block 
the wind (or rain or noise) is not social. Not behavioral differences but goals 
distinguish social action from non-social action. 

We may call “weak SA” the one based just on social beliefs: beliefs about other 
agents’ minds or actions; and “strong SA” that which is also directed by social goals. 

It is commonly agreed in AI that “social agents” are equivalent to “communicating 
agents”. According to many scholars, communication is a necessary feature of 
agenthood (in the AI sense) [Woo95] [Gen94] [Rus95]. Moreover, the advantages of 
communication are systematically mixed up with the advantages of coordination or of 
cooperation. Communication is an instrument for SA (of any kind: either cooperative 
or aggressive). Communication is also a kind of SA aimed at giving beliefs to the 
addressee. This is a true and typical Social Goal, since the intended result concerns a 
mental state of another agent. However, communication is not a necessary 
component of social action and interaction. To kill somebody is for sure a SA 
(although not very sociable!) but it neither is nor requires communication. Pro-social 
actions, too, do not necessarily require communication. In sum, neither agency nor 
sociality is grounded on communication, although, of course, communication is very 
important for social interaction. 

Strong social action is characterised by social goals. A social goal is defined as a goal 
that is directed toward another agent, i.e. whose intended results include another agent 
as a cognitive agent: a social goal is a goal about other agents’ minds or actions. 
Examples of typical social goals (strong SAs) are: changing the other’s mind, 
communication, hostility (blocking the other’s goal), strong delegation, adoption 
(favouring the other’s goal). In this case, we not only have Beliefs about others’ Beliefs 
or Goals (weak social action) but also Goals about the mind of the other: A wants 
B to believe something; A wants B to want something. We cannot understand social 
interaction or collaboration or organisations without these social goals. Personal 
intentions of doing one’s own tasks, plus beliefs (although mutual) about others’ 
intentions (as used in the great majority of current AI models of collaboration) are not 
enough. 
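
The difference between social beliefs (weak SA) and social goals (strong SA) can be sketched with nested mental attitudes. The `Bel` and `Goal` classes and the two predicates below are hypothetical illustrations, not the author's formalism. The sketch covers only attitudes about another agent's mind; goals about another agent's actions, which the text also counts as social, are left out for brevity.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Bel:
    agent: str
    content: "Attitude"

@dataclass(frozen=True)
class Goal:
    agent: str
    content: "Attitude"

# A plain string stands for a proposition about the world.
Attitude = Union[Bel, Goal, str]

def mentions_other_mind(att, holder: str) -> bool:
    """True if att contains an attitude held by an agent other than holder."""
    if isinstance(att, str):
        return False
    return att.agent != holder or mentions_other_mind(att.content, holder)

def is_social_belief(att) -> bool:
    """Weak SA ingredient: a belief about another agent's mind."""
    return isinstance(att, Bel) and mentions_other_mind(att.content, att.agent)

def is_social_goal(att) -> bool:
    """Strong SA ingredient: a goal about another agent's mind."""
    return isinstance(att, Goal) and mentions_other_mind(att.content, att.agent)

a_wants_b_believes = Goal("A", Bel("B", "p"))       # A wants B to believe p
a_believes_b_wants = Bel("A", Goal("B", "q"))       # A believes B wants q
a_wants_warmth = Goal("A", "the room is warm")      # a non-social goal
```

In this encoding "A wants B to believe something" is a goal whose content is itself an attitude of B, which is what a model based only on beliefs about others' intentions cannot express.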

Action and social action are possible, of course, also at the reactive level, among sub- 
cognitive agents (like bees). A definition of SA, communication, adoption, 
aggression, etc. is possible also for non-cognitive agents. However, at this level too 
those notions must be goal-based. Thus, a theory of merely goal-oriented (not 
“goal-directed”) systems and of implicit goals is needed. However, there are levels of 
sociality that cannot be attained reactively. Also at a sub-cognitive level, a SA is an 
action that deals with another entity as an agent, i.e. as an active, autonomous, 
goal-oriented entity. However, the problem here is that the agent has no mind with 
which to consider the other agent ‘as an agent’. Subjectively the first agent acts as 
if towards a physical entity: it just reacts to stimuli or conditions. So, in what sense 
is its action social? In what sense does it treat the other entity ‘as an agent’? We 
cannot consider social any behaviour that simply accidentally affects another agent; 
teleonomy is needed: either the behaviour is intended (and this is not the case) or it 
is goal-oriented, functional to affecting another agent. 

5.3 Communication 

The same reasoning applies to communication (which in fact is a social action). As I 
said, a definition of communication is possible also for non-cognitive agents. 
However, at this level too, that notion must be goal-based: either intentional or 
functional. We cannot consider as communication any information/sign arriving from 
A to B, unless it is aimed at informing B. For example, B can observe A - A not being 
aware of this - and can understand a lot of things about A, but A is not communicating 
with B or informing B about all those things. 

6 Constructing a Bridge 

It is necessary to discuss how we may advance towards a bridge between cognition 
and emergence, intention and function, autonomous goal-governed agents and goal- 
oriented social systems. 

6.1 Emergent forms of cooperation among cognitive agents 

Social cooperation does not necessarily need agents’ understanding, agreement, 
contracts, rational planning, collective decisions [Mac98]. There are forms of 
cooperation that are deliberated and contractual (like a company, a team, an organised 
strike), and other forms of cooperation that are emergent: non contractual and even 
unaware. Modelling those forms is very important. Our claim [Cas97b] [Cas92] is that 
it is important to model them not just among sub-cognitive agents (using learning or 
selection of simple rules) [Ste90] [Mat92], but also among cognitive and planning 
agents whose behaviour is regulated by anticipatory representations. In fact, even these 
agents cannot understand, predict, and dominate all the global and compound effects 
of their actions at the collective level. Some of these effects are self-reinforcing and 
self-organising. 

• For instance in the case of hetero-directed or orchestrated cooperation, only a 
boss' mind conceives and knows the plan, while the involved agents may even ignore 
the existence of a global plan and of the other participants. 

• This is also the case of functional self-organising forms of social cooperation 
(like the technical division of labour) where no mind at all conceives or knows the 
emerging plan and organisation. Each agent is simply interested in its own local goal, 
interest and plan; nobody directly takes care of the task distribution, of the global plan 
and equilibrium. 

6.2 Social Functions and Cognition 

Both objective structures and unplanned self-organising complex forms of social order 
and social functions emerge from the interactions of agents in a common world and 
from their individual mental states; both these structures and self-organising systems 
feed back into the agents’ behaviours through the agents’ individual minds, either by the 
agents’ understanding of the collective situation (cognitive emergence) or by 
constraining and conditioning agent goals and decisions. The aim of this section is to 
analyse the crucial relationship between social “functions” and cognition: cognitive 
agents' mental representations. I claim, in fact, that without a theory of emerging 
functions among cognitive agents, social behavior cannot be fully explained. 

In my view, current approaches to cognitive agent architectures (in terms of Beliefs 
and Goals) allow for a solution of this problem; though perhaps we need some more 
treatment of emotions. One can explain quite precisely this relation between cognition 
and the emergence and reproduction of social functions. In particular, functions install 
and maintain themselves parasitically on cognition. For a Social Norm to work as a 
social norm and be fully effective, agents should understand it as a social norm. 
However the effectiveness of a social function is independent of the agents' 
understanding of this function of their behavior. In fact: 

a) the function can arise and maintain itself without the awareness of the agents; 

b) if the agents intend the results of their behavior, these would no longer be 
"social functions" of their behavior but just "intentions" [Els82]. 

I accept Elster's crucial objection to classical functional notions, but I think that it is 
possible to reconcile intentional and functional behavior. With an evolutionary view 
of "functions" it is possible to argue that intentional actions can acquire unintended 
functional effects. 

How to build unaware functions and cooperation on top of intentional actions and 
intended effects? How is it possible that positive results -thanks to their advantages- 
reinforce and reproduce the actions of intentional agents, and self-organise and 
reproduce themselves, without becoming simple intentions? [Els82]. This is the real 
theoretical challenge for reconciling emergence and cognition, intentional behavior 
and social functions, planning agents and unaware cooperation. We need more 
complex forms of reinforcement learning, not just based on classifiers, rules, 
associations, etc. but operating on the cognitive representations governing the action, 
i.e. on beliefs and goals. 

Functions are just effects of the behavior of the agents, that go beyond the intended 
effects and succeed in reproducing themselves because they reinforce the beliefs and 
the goals of the agents that caused that behavior. Then: 

• First, behavior is goal-directed and reasons-based; i.e. it is intentional action. The 
agent bases its goal-adoption, its preferences and decisions, and its actions on its 
Beliefs (this is the definition of "cognitive agents"). 

• Second, there is some effect of those actions that is unknown or at least 
unintended by the agent. 

• Third, there is circular causality: a feedback loop from those unintended effects 
that increments or reinforces the Beliefs or the Goals that generated those actions. 

• Fourth, this “reinforcement” increases the probability that in similar 
circumstances (activating the same Beliefs and Goals) the agent will produce the 
same behavior, thus "reproducing" those effects. 

• Fifth, at this point such effects are no longer "accidental" or unimportant: 
although remaining unintended they are teleonomically produced [Con95, ch.8]: 
that behavior exists (also) thanks to its unintended effects; it was selected by 
these effects, and it is functional to them. Even if these effects could be negative 
for the goals or the interest of (some of) the involved agents, their behavior is 
"goal-oriented" to these effects. 

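The five steps above can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not a model proposed in the text: the names `CognitiveAgent`, `unintended_effect` and `reinforce`, the single scalar `belief_strength`, and the coefficient 0.2 are all hypothetical. It shows that the agent never represents the side effect, yet the side effect raises the probability of the behaviour that produces it.

```python
from dataclasses import dataclass


@dataclass
class CognitiveAgent:
    # Strength of the Belief that motivates the action, on a hypothetical 0..1 scale.
    belief_strength: float = 0.5

    def probability_of_acting(self) -> float:
        # Step 1: behaviour is belief-based. As a simplifying assumption,
        # the probability of acting simply equals the belief strength.
        return self.belief_strength


def unintended_effect(action_rate: float) -> float:
    # Step 2: a side effect the agent neither knows nor intends.
    # Hypothetical choice: proportional to how often the action is performed.
    return 0.2 * action_rate


def reinforce(agent: CognitiveAgent, effect: float) -> None:
    # Step 3: the side effect feeds back on the Belief, moving it towards 1.
    # The agent has no representation of `effect`; only its Belief changes.
    agent.belief_strength += effect * (1.0 - agent.belief_strength)


def run(steps: int) -> list:
    agent = CognitiveAgent()
    history = [agent.probability_of_acting()]
    for _ in range(steps):
        effect = unintended_effect(agent.probability_of_acting())
        reinforce(agent, effect)
        # Step 4: the same behaviour becomes more probable next time.
        history.append(agent.probability_of_acting())
    return history


if __name__ == "__main__":
    for step, probability in enumerate(run(10)):
        print(step, round(probability, 3))
```

In this sketch the behaviour is "selected by" its unintended effect in the sense of the fifth step: removing the feedback in `reinforce` would leave the probability of acting constant.
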
7 Towards Social Computing? 

I will conclude with the importance of the new “social” computational 
paradigm [Gas98], and with some doubts and questions about the ‘invisible hand’ and 
the ‘emergent character’ of computation in Agent Based Computing. 

Abstract. This paper attempts to characterise a unifying overview of 
the practice of software engineers, AI designers, developers of evolution- 
ary forms of computation, designers of adaptive systems, etc. The topic 
overlaps with theoretical biology, developmental psychology and perhaps 
some aspects of social theory. Just as much of theoretical computer sci- 
ence follows the lead of engineering intuitions and tries to formalise them, 
there are also some important emerging high level cross disciplinary 
ideas about natural information processing architectures and evolutionary 
mechanisms that can perhaps be unified and formalised in the 
future. 



1 Introduction: Exploring Design Space 

AI can be construed as the exploration of the space of possible designs for (more 
or less) intelligent agents, whether natural or artificial [4,13]. Designs are not all 
static. Some systems change aspects of their own design: they modify themselves, 
through learning, adaptation, or architectural development, e.g. from embryo to 
infant, and from infant to adult. Brain damage or disease can also produce design 
changes with deleterious consequences. 

All these changes move a machine or organism from one region of design 
space to another. Possible routes through design space can be thought of as 
trajectories in the space. 

Some regions of design space are not linked by possible trajectories for in- 
dividual development. An acorn can transform itself into an oak tree, and by 
controlling its environment you can slightly modify what sort of oak tree (e.g. 
how big). But no matter how you try to train or coax it by modifying the envi- 
ronment, it will never grow into a giraffe. The acorn (a) lacks information needed 
to grow into a giraffe, (b) lacks the architecture to absorb and use such informa- 
tion, and (c) lacks the architecture required to modify itself into an architecture 
that can absorb the information. 

Trajectories that are possible for an individual which adapts or changes itself 
will be called i-trajectories. Different sorts of i-trajectories could be distinguished 
according to the sorts of mechanisms of change, e.g. innately determined devel- 
opment, reinforcement learning, facilitation by repetition, and various kinds of 
self-organising processes partly driven by the environment. 

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 27-39, 1998. 

Trajectories which are not possible for an individual machine or organism 
but are possible across generations of individuals subject to a particular type of 
evolutionary development will be called e-trajectories. Examples include devel- 
opment of humans and other animals from much simpler organisms and modifi- 
cations of software structures by genetic algorithms. Conjectured e-trajectories 
leading to human minds are discussed in [5] and [18]. 

Whether two designs are linked by an e-trajectory or not will depend on 
the type of evolutionary mechanism available for manipulating genetic struc- 
tures and the type of ontogenetic mechanism available for producing individuals 
(phenotypes) from genotypes. In biological organisms the two are connected: 
the ontogenetic mechanism can also evolve. Lamarckian inheritance would allow 
i-trajectories to form parts of e-trajectories. 

There are also some changes to individuals that are possible only through 
external intervention by another agent, e.g. performing repairs or extensions, or 
debugging software. Such changes follow r-trajectories (repair-trajectories). 

Viewing a species as a type of individual, e-trajectories for individuals form 
i-trajectories for a species, or a larger encompassing system, such as an ecology. 
Researchers in AI, Alife and evolutionary computation all contribute to the study 
of such trajectories. This study is in its infancy. 

2 “Semantics of Evolution” 

In computer science “semantics of computation” refers to abstract, mathemat- 
ical, properties of programming languages, data structures and the processes 
which can occur in virtual machines of various sorts. Likewise a study of the 
most general features of evolutionary trajectories in design space addresses a 
topic that could be called “semantics of evolution” (though both differ from the 
more common use of the word “semantics” in linguistics). 

Milner [9] noted that theoretical computer science follows the lead of engi- 
neering intuitions and tries to formalise them. Since the intuitions are often very 
subtle and complex the process of formalisation can lag far behind. Likewise 
attempts to study and formalise the space of possible designs and the various 
trajectories in design space will lag behind intuitive understanding gained from 
empirical research in biology, psychology, and computational explorations. This 
paper attempts to identify some of the phenomena to be formalised. 

Many have attempted to formalise features of evolution, individual learning, 
development etc. Kauffman [7] describes mathematical laws which constrain bi- 
ological mechanisms and processes in surprising ways. The ideas discussed below 
deal with phenomena which at present are too ill defined for mathematical for- 
mulation and computational modelling. 

3 Generalising Fitness Landscapes 

Evolutionary trajectories are often represented in a “fitness landscape” where 
a fitness value is associated with various locations in design space. If a class 
of designs can be specified by two parameters (e.g. length and stiffness of a 
spring), then there is a 2-D space of designs, and adding a measure of fitness of 
the spring for a particular purpose, produces a 3-D fitness landscape. Typically, 
design spaces are far more complex than this, and cannot be specified by a 
fixed number of parameters, e.g. designs for Prolog compilers vary in structure 
and complexity. Moreover, many designs have no single fitness measure: Prolog 
compilers vary according to their portability, the speed of compilation, the speed 
of compiled code, the size of compiled code, the kinds of error handling they 
support, etc. 

A set of constraints and requirements for a design can be called a “niche”. 
Fitness of designs for organisms and artefacts may be compared in relation to a 
niche. Requirements for a compiler can vary just as the requirements for a plant 
or insect can. So there is a space of possible niches: “niche space” ([13,15,17]). 

Designs and niches are both abstract and can have different concrete instan- 
tiations. Designs don’t presuppose a designer and requirements (niches) don’t 
presuppose a requirer. There are different ways actual requirements can be gen- 
erated: e.g. engineering goals vs biological needs and pressures. 

Two insects in the same physical location can be in different niches, so niches 
are not determined by physical location. Neither are they simply in the eye of 
a beholder: niche-pressure can influence movement of individuals or a species in 
design space. This can happen in different ways. If the individual has adaptive 
capabilities it may move along an i-trajectory so as to fit a niche better. Alter- 
natively the pressure may cause a gene pool, or a subset of a gene pool, to move 
along an e-trajectory. 

There are many different sorts of causal relations: within an architecture, 
between architectures, between architectures and niches, between niches. Niches 
can interact with one another by producing pressure for changes in designs, 
which in turn can change niches, as in co-evolution of organisms. So there are 
trajectories in niche space as well as design space. 

Where independent changes in different dimensions are possible, a complex 
niche can cause parallel design changes, e.g. making an organism both physically 
stronger and better able to recognize complex physical structures. Problems arise 
when the changes are not independent: e.g. increasing agility may conflict with 
increasing strength. Where there are pressures for incompatible changes, which 
one actually occurs may depend on subtle features of the total context, and two 
identical individuals may be pushed along divergent trajectories because of slight 
differences in context. 

Feedback loops occur where changes in one individual or group alter the 
niche, and therefore the causal influences on another individual or group. Such 
feedback can lead to continual change, to oscillations, or to catastrophes. 

4 Simple and Complex Fitness Relations 

In the simplest case a design either fits or does not fit a niche. More generally 
the relation is more complex, and different designs may fit the same niche to 
different degrees. Besides being simply better or worse, designs may fit a niche 
in different ways - e.g. different combinations of speed, robustness, flexibility, 
etc. The different styles of arrows in Figure 1 are intended to indicate this. 

Fig. 1. Design space and niche space: Relations between designs and niches are 
complex and varied “fitness” descriptions, not numeric values. Trajectories are 
not shown. 

If the fitness of designs solving a particular problem varies only in degree, then 
the search for a solution can be thought of as the search for a location in design 
space where the fitness value is highest. This is the familiar notion of a fitness 
landscape. A learning mechanism or an evolutionary process might pursue a 
trajectory in which the design is guided towards a fitness peak in the landscape. 
High peaks can be very hard to find. 
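
The familiar scalar case can be sketched in a few lines. The example below is an illustration only: it assumes a two-parameter design (the length and stiffness of a spring, on an integer grid) and a hypothetical fitness function with a single peak at length 10 and stiffness 3. The function names `fitness`, `neighbours` and `hill_climb` are likewise illustrative.

```python
def fitness(length, stiffness):
    # Hypothetical scalar fitness with a single peak at (10, 3).
    return -((length - 10) ** 2 + (stiffness - 3) ** 2)


def neighbours(design):
    # Minimal design changes: alter one parameter by one unit.
    length, stiffness = design
    return [
        (length + 1, stiffness),
        (length - 1, stiffness),
        (length, stiffness + 1),
        (length, stiffness - 1),
    ]


def hill_climb(start):
    # Follow a trajectory in design space that always moves to the
    # fittest neighbour, stopping when no neighbour is an improvement.
    trajectory = [start]
    current = start
    while True:
        best = max(neighbours(current), key=lambda d: fitness(*d))
        if fitness(*best) <= fitness(*current):
            return trajectory
        current = best
        trajectory.append(current)


if __name__ == "__main__":
    path = hill_climb((2, 8))
    print(len(path) - 1, "steps, ending at", path[-1])
```

With a single peak the climb always reaches the top. On a landscape with several peaks the same procedure can stop at a lower one, which is one reason high peaks can be hard to find.
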

We can now generalise this idea in a number of ways. 

(a) Instead of a fixed niche determining the evaluation we consider a space of 
niches as well as the space of designs. A design can then be assessed in rela- 
tion to many possible niches, so that it does not have only one fitness value. 
Kauffman [7, pp 221ff] allows for this by mentioning that the fitness values for 
particular designs, and therefore the landscape, can change if objects in the en- 
vironment change. If those objects also have fitness landscapes then there are 
coupled fitness landscapes, each causing changes in the other. His notion of a 
fitness landscape changing corresponds to our notion of a design having different 
fitness descriptions in relation to different regions of niche space. 

(b) Instead of each niche determining an evaluation function which yields a 
numeric fitness, or even a total ordering of designs, it may determine a collection 
of incommensurable criteria for assessing designs (like speed and error handling 
in a compiler, or protection from predators and from cold in a house). In general 
the comparison of a niche and a design will yield not a number but a description 
of the match ([8,11]). In simple cases this could be a vector of values. Sometimes 
there is a partial ordering of the descriptions, and sometimes not even that, 
because there is no way to combine the different dimensions of comparison. 
Design A might be better than B in one respect, B better than C in another 
and C better than A in a third. (Compare consumer reports.) Engineers are very 
familiar with such tradeoffs between designs. 
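
The simple vector case can be made concrete. In the sketch below the three criteria (speed, robustness, flexibility) and the scores given to designs A, B and C are hypothetical, chosen to reproduce the situation just described; `dominates` is the usual Pareto-style comparison, used here as one possible way of ordering descriptions, not as the paper's own definition.

```python
def dominates(x, y):
    # x dominates y if it is at least as good on every criterion
    # and strictly better on at least one.
    at_least_as_good = all(a >= b for a, b in zip(x, y))
    strictly_better = any(a > b for a, b in zip(x, y))
    return at_least_as_good and strictly_better


def compare(x, y):
    # Returns a description of the comparison rather than a number.
    if x == y:
        return "equal"
    if dominates(x, y):
        return "better"
    if dominates(y, x):
        return "worse"
    return "incomparable"


# Hypothetical scores on (speed, robustness, flexibility).
designs = {
    "A": (3, 1, 2),  # beats B on speed
    "B": (2, 3, 1),  # beats C on robustness
    "C": (1, 2, 3),  # beats A on flexibility
}

if __name__ == "__main__":
    for first, second in [("A", "B"), ("B", "C"), ("C", "A")]:
        print(first, "vs", second, "->", compare(designs[first], designs[second]))
```

Each pair comes out as incomparable: no design is better in all respects, so no ranking of A, B and C follows from the descriptions alone.
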

(c) Different designs and different niches can vary in complexity. So also do the 
descriptions of fitness. Assessment of a spelling checker will be much simpler 
than assessment of an operating system. 

(d) If fitness values are non-scalar, trajectories no longer lead uphill, downhill or 
horizontally. A path can lead to improvements in some respects and degradation 
in others. “Selection” becomes a problematic notion. 

(e) Having separate fitness values allows different sorts of selection to occur 
in sequence, simplifying the evolutionary design task by decomposing it, as an 
engineer might. When the changes required to improve different features are not 
independent there may be no useful decomposition. The “divide and conquer” 
approach is not always applicable: sometimes a creative new design is needed. 

(f) If identical individuals inhabit slightly different niches (e.g. because of differ- 
ent social roles or different neighbours) reproductive success might be favoured 
by different traits. E.g. in some farming communities physical strength may be 
more important than intelligence, whereas in a nearby industrialised region in- 
telligence is more useful for acquiring resources to raise a family. Thus different 
e-trajectories can be explored in parallel within a population exposed to differ- 
ent niches. Functional differentiation within a social system can accelerate this. 
Since motivation and performance are linked we can expect diversity of tastes 
as well as abilities. This may explain why we have individuals both able and 
willing to be concert pianists, steeplejacks, brain surgeons, etc. 

(g) Since designs have complex structures, a niche can change simply because of 
a change within a design, without anything changing in the environment. E.g. a 
change which increases running speed may alter energy requirements. So some 
aspects of a design determine requirements for other aspects. What you need is 
partly determined by what you’ve got. This can generate positive feedback loops 
driving designs along e-trajectories without any environmental changes. 

(h) The effects of niche pressures change in character when organisms develop 
cognitive abilities which enable them to recognize their own and others’ needs 
and abilities. If they can assess in advance the relevance of different physi- 
cal characteristics or behaviours to filling those needs, then improvements in 
a useful trait may be selected both directly through differing abilities to pro- 
vide for offspring, and indirectly through recognition of the trait by potential 
mates. Eugenic social policies are similar. Cognitive abilities can also influence 
co-evolution: if predators can tell in advance which prey are easier to catch, they 
can select victims in a herd. Thus recognition of signs of weakness can accelerate 
the elimination of weaker traits. So cognitive processes in organisms can make 
evolution and co-evolution more like a process of design. 

From the standpoint described here, genetic algorithms which use a scalar 
“fitness function” are simply a special case. Moreover, in artificial evolution the 
designer often adds a separate selection function which uses the output of the 
fitness function. In natural evolution (and some Alife scenarios) selection and 
fitness are related in more subtle and varied ways. 

We have seen that when organisms have cognitive abilities this can make 
evolution more like a design process. Perhaps we should think of the biosphere as 
a sort of super-organism struggling to develop itself and through the intelligence 
of its products becoming more like a designer. 



5 Dynamics, Discontinuities and Inhomogeneities 

Since niches and designs interact dynamically, we can regard them as parts of 
virtual machines in the biosphere consisting of a host of control mechanisms, 
feedback loops, and information structures (including gene pools). All of these 
are ultimately implemented in, and supervenient on, physics and chemistry. But 
they and their causal interactions may be as real as poverty and crime and their 
interactions. 

The biosphere is a very complex abstract dynamical system, composed of 
many smaller dynamical systems. Some of them are evanescent (e.g. tornados), 
some enduring but changing over diverse time scales (e.g. fruit flies, oak trees, 
ecosystems). Many subsystems impose constraints and requirements to be met 
or overcome by other subsystems: one component’s design is part of another 
component’s niche. Through a host of pressures, forces and more abstract causal 
relations, including transfer of factual information and control information, sys- 
tems at various levels are constantly adjusting themselves or being adjusted or 
modified. Some of the changes may be highly creative, including evolution of 
new forms of evolution, and new mechanisms for copying and later modifying 
modules to extend a design. 

These ideas may seem wild, but they are natural extensions of ideas already 
accepted by many scientists and engineers, e.g. [7,2]. 

Discontinuous e-trajectories. Both design space and niche space have 
very complex topologies, including many discontinuities, some small (e.g. adding 
a bit more memory to a design, adding a new step in a plan) some large (adding 
a new architectural layer, or a new formalism). Understanding natural intelli- 
gence may require understanding some major discontinuities in the evolutionary 
history of the architectures and mechanisms involved. This in turn may help us 
with the design of intelligent artefacts. 

I suspect discontinuities in design space occur somewhere between systems 
that are able merely to perform certain tasks, and others which can use gen- 
eralisations they have learnt about the environment to create new plans, i.e. 
between reactive and deliberative architectures. Discontinuities in e-trajectories 
can occur when an old mechanism is copied then modified: e.g. a mechanism 
which originally associates sensory patterns with appropriate responses could be 
copied then used to associate sensory patterns with predicted sensory patterns 
or with a set of available responses. 

Discontinuities might also be involved in the evolution of the “reflective” 
abilities described below: not only being able to do X but having and being able 
to use information on how X was done, or why X was done, or why one method 
of doing X was used rather than another. (Compare [17,20].) What sorts of niche 
pressures in nature might favour such e-trajectories is an interesting biological 
question. 

Design space and niche space are “layered”: regions within them are de- 
scribable at different levels of abstraction and for each such region different 
“specialisations” exist. Some specialisations of designs are called implementa- 
tions. The philosopher’s notion of “supervenience” and the engineer’s notion of 
“implementation” (or realisation) seem to be closely linked, if not identical. 

Both are inhomogeneous spaces: local topology varies with location in the 
space, since the minimal changes possible at various locations in the same space 
can be very different in type and number. Consider designs of different complexity: 
there are typically more ways, and more complex ways, of altering a complex 
design than a simple design. So they have neighbourhoods of different structures. 
By contrast, in most multi-dimensional spaces considered by scientists and engi- 
neers (e.g. phase spaces), each point has the same number of dimensions, i.e. the 
same number and the same types of changes are possible at all points (unless 
limited by equations of motion). 

Discontinuous i-trajectories. A system which develops, learns or adapts 
changes its design. I-trajectories, like e-trajectories, can be discontinuous (e.g. 
cell division) and link regions in inhomogeneous spaces. The most familiar ex- 
amples are biological: e.g. a fertilised egg transforming itself into an embryo and 
then a neonate. In many animals, including humans, the information process- 
ing architecture seems to continue being transformed long after birth, and after 
the main physiological structures have been established: new forms of control 
of attention, learning, thinking, deliberating, develop after birth. Ontogeny may 
partly recapitulate phylogeny: but cultural influences may change this. 

Humans follow a very complex trajectory in design space throughout their 
lives. A good educational system can be viewed as providing a trajectory through 
niche space which will induce a trajectory in design space in self-modifying 
brains. A culture provides a set of developmental trajectories. 

In general, following a trajectory in design space also involves a trajectory in 
niche space: the niches for an unborn foetus, for a newborn infant, a schoolchild, a 
parent, a professor, etc. are all different. Moreover, an individual can instantiate 
more than one design, satisfying more than one niche: e.g. protector and provider, 
or parent and professor. To cope with development of multi-functional designs we 
can include composite niches in niche space, just as there are composite designs 
in design space. 

6 Trajectories for Virtual Machines in Software Systems 



A distinction between i-trajectories and e-trajectories can be made for evolvable 
software individuals inhabiting virtual machines. A word processor which adapts 
itself to different users may or may not be capable of turning itself into a compiler 
through a series of adaptations. As in nature, there may be e-trajectories linking 
software designs that are not linked by i-trajectories. 

Whether an e-trajectory exists from one software design to another in an 
artificial evolutionary system depends on (a) whether there is a principled way 
of mapping the features of the designs onto genetic structures which can be used 
to recreate design instances via an instantiation (ontogenetic) function, and (b) 
whether the structures can be manipulated by the permitted operators (e.g. 
crossover and mutation), so as to traverse a trajectory in “gene space” which 
induces a trajectory in design space via the instantiation function. Whether some 
sort of evaluation function or niche pressure can cause the traversal to occur is a 
separate question [10]. E-trajectories can exist which our algorithms never find. 
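
Conditions (a) and (b) can be illustrated with a very small gene space. Everything in the sketch below is an assumption made for illustration: genotypes are 4-bit tuples, the hypothetical `instantiate` function maps the first three bits to a number of memory units and the last bit to the presence of a planner, and the only permitted operator is a single-bit mutation at the positions listed in `mutable_positions`. The search is exhaustive, so it answers only whether an e-trajectory exists, not whether any evaluation function or niche pressure would cause it to be traversed.

```python
from collections import deque


def instantiate(genotype):
    # Hypothetical instantiation (ontogenetic) function: genotype -> design.
    *memory_bits, planner_bit = genotype
    return (sum(memory_bits), bool(planner_bit))


def mutations(genotype, mutable_positions):
    # Permitted operator: flip one bit at a mutable position.
    for i in mutable_positions:
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def e_trajectory(start, target_design, mutable_positions):
    # Breadth-first search in gene space. Returns the induced trajectory
    # in design space, or None if the target design cannot be reached.
    queue = deque([(start, [instantiate(start)])])
    seen = {start}
    while queue:
        genotype, path = queue.popleft()
        if instantiate(genotype) == target_design:
            return path
        for child in mutations(genotype, mutable_positions):
            if child not in seen:
                seen.add(child)
                queue.append((child, path + [instantiate(child)]))
    return None


if __name__ == "__main__":
    start = (0, 0, 0, 0)
    target = (2, True)
    print(e_trajectory(start, target, range(4)))  # all bits may mutate
    print(e_trajectory(start, target, range(3)))  # planner bit is frozen
```

When the operators cannot touch the planner bit, no sequence of mutations links the two designs, even though both are points in the same design space.
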



7 Evolution of Human-Like Architectures 

We have argued in [14] and elsewhere (contra Dennett’s “intentional stance”) 
that many familiar mental concepts presuppose an information processing ar- 
chitecture. We conjecture that it involves several different sorts of coexisting, 
concurrently active, layers, including an evolutionarily old “reactive” layer in- 
volving dedicated highly parallel mechanisms each responding in a fixed way to 
its inputs. These may come from sensors or other internal components, and the 
outputs may go to motors or internal components, enabling loops. Some reac- 
tive systems have a fixed architecture except insofar as weights on links change 
through processes like reinforcement learning. Insects appear to have purely 
reactive architectures implementing a large collection of evolved behaviours. As 
suggested in [18], sophisticated reactive architectures may need a global “alarm” 
mechanism to detect urgent and important requirements to override relatively 
slow “normal” processes. This can interrupt and redirect other subsystems (e.g. 
freezing, fleeing, attacking, attending). 

A hybrid architecture, as shown in Figure 2, could combine a reactive layer 
with a “deliberative” layer which includes the ability to create new temporary 
structures representing alternative possibilities for complex future actions, which 
it can then compare and evaluate, using further temporary structures describing 
similarities and differences. This plan-construction requires a long term memory 
associating actions in contexts with consequences. After creating and selecting a 
new structure the deliberative system may execute it as a plan, and then discard 
it. Alternatively it may be able to modify itself permanently by saving some or 
all of the structure for future re-use. In humans the reactive architecture seems 
also to be extendable by re-use of plans, e.g. learning car driving or language 
comprehension may create new reactions. 

Fig. 2. A hybrid reactive and deliberative architecture. A global “alarm” 
mechanism could be added. See text. 



As before, a global alarm mechanism may be needed for coping with dangers 
and opportunities requiring rapid reactions. In mammals this seems to involve 
the limbic system, and emotional processes [3,6]. 

A deliberative mechanism will (normally) be discrete, serial, and therefore 
relatively slow, whereas a reactive mechanism can be highly parallel and therefore 
very fast, and may include some continuous (analog) mechanisms, possibly using 
thresholds. Resource limits in deliberative mechanisms may generate a need 
for an attention filter of some kind, limiting the ability of reactive and alarm 
mechanisms to interrupt high level processing. 

By analysing tradeoffs we may be able to understand how niche-pressures 
can lead to development of combined, concurrent deliberative and reactive ar- 
chitectures in organisms. 

Everything that can be done by a hybrid architecture could in principle be 
done by a suitably complex reactive architecture e.g. a huge, pre-compiled lookup 
table matching every possible history of sensory inputs with a particular com- 
bination of outputs. However, pre-requisites for such an implementation may be 
prohibitive: much longer evolution, with more varied evolutionary environments, 
to pre-program all the reactive behaviours, and far more storage to contain them, 
etc. For certain agents the universe may be neither old and varied enough for 
such development nor big enough to store all the combinations required to match 
a deliberative equivalent with generative power. Perhaps evolution “discovered” 
this and therefore favoured deliberative extensions for some organisms. 

A deliberative mechanism changes the niches for perceptual and motor 
mechanisms, requiring them to develop new layers of abstraction, as indicated in 
Figure 2. Likewise, development of new, higher level, abstractions in perceptual 
and motor systems may change the niches for more central mechanisms, e.g. 
providing new opportunities for learning and simplified planning. 

Fig. 3. A meta-management layer adds self-monitoring, self-evaluation, etc. 
Three classes of emotions correspond to alarm mechanisms (not shown, to save 
clutter) acting on the three layers. 

Meta-management. Reflection on and retrospective evaluation of actions 
can often lead to future improvements. This is also true of internal actions. Thus 
besides abilities to perceive the environment and how external actions change it, 
there is a use also for internal self-monitoring, self-evaluation, self-modification 
(self-control) applied to internal states and processes. This could explain the 
evolution of a third architectural layer, as indicated in Figure 3. 

Sensory “qualia” arise in self-monitoring mechanisms with access to interme- 
diate sensory information structures not normally attended to. Different kinds 
of sensory qualia depend on different perceptual abstraction layers. Such “self- 
knowledge” is distinct from normal perception providing knowledge about the 
environment. “Meta-management” capabilities produce other sorts of qualia re- 
lated to thinking processes, deliberation, desires, etc. 

Robots with these capabilities might begin to wonder how their mental pro- 
cesses are related to their physical implementation, just as human philosophers 
do. Some of them, not fully understanding the notion of virtual machine func- 
tionality and the varieties of forms of supervenience, might even produce spuri- 
ous but convincing arguments that they have conscious processes which cannot 
be explained by or fully implemented in physical processes. They may wonder 
whether humans are actually zombies with all the behavioural capabilities of 
conscious robots, but lacking their consciousness. I believe this solves the 
so-called “hard” problem of consciousness, see [1]. (Earlier work exploring these 
ideas can be found in [12,13,14,16,17,19].) 

Such agents (with a combination of reactive, deliberative and self-management 
sub-architectures) may combine to form social systems. Questions 
about trajectories in design space and niche space arise for social systems also. 
Human social systems develop information and rules which are transmitted to 
individuals, including rules that control meta-management (e.g. through guilt). 

Accounting for all this needs a theory embracing computer science, theoreti- 
cal biology, AI, psychology, neuroscience, anthropology, sociology, etc. Good soft- 
ware engineers and AI researchers are beginning to develop new intuitions about 
some of these things and it should be possible to find mathematical constructs 
that capture those intuitions, as computer science follows software engineering. 



8 Further Work 



We need to find more precise ways of describing architectures, designs, niches 
and their causal interactions, to improve on the high level concepts used only 
intuitively at present. This will involve both abstracting from domain specific 
details, so as to replace empirical concepts with mathematical concepts, and 
also enriching our understanding of the details of the processes, so that we can 
characterise and model the dynamics. 

If the intuitive notions of niche, genotype etc. in biology can be made suf- 
ficiently precise to enable us to understand precisely the relationships between 
niches and designs for organisms, this may provide a better understanding of 
the dynamics and trajectories in biological evolution, including the evolution of 
evolvability. 

This could lead to advances in comparative psychology. Understanding the 
precise variety of types of functional architectures in design space and the vir- 
tual machine processes they support, will enable us to describe and compare in 
far greater depth the capabilities of various animals. We’ll also have a concep- 
tual framework for saying precisely which subsets of human mental capabilities 
they have and which they lack. Likewise the discussion of mental capabilities 
of various sorts of machines could be put on a firmer scientific basis, with less 
scope for prejudice to determine which descriptions to use. E.g. instead of ar- 
guing about which animals, which machines, and which brain damaged humans 
have consciousness, we can determine precisely which sorts of consciousness they 
actually have. 

We could also derive new ways of thinking about human variability and the 
causes and effects of mental illness, brain damage, senile dementia, etc. This 
could have profound practical implications. 

Abstract. In this article we analyze the problem of searching the 
WWW, giving some insight and models to understand its complexity. 
Then we survey the two main current techniques used to search the 
WWW. Finally, we present recent results that can help to partially solve 
the challenges posed. 



1 Introduction 

The boom in the use of the World Wide Web (WWW) and its exponential 
growth are well known facts nowadays. The amount of textual data available 
alone is estimated to be on the order of one terabyte. In addition, other media, 
such as images, audio and video, are also available. Thus, the WWW can be seen 
as a very large, unstructured but ubiquitous database. This triggers the need for 
efficient tools to manage, retrieve, and filter information from this database. 
This problem is also becoming important in large Intranets, where we want to 
extract or infer new information to support a decision process. This task is 
called data mining. We make the important distinction between data and 
information. The latter is processed data that fulfills our needs. 

In this article we outline the main problems of searching the WWW and some 
partial solutions to them. We focus on text, because although there are 
techniques to search in images and other non-textual data, they cannot be applied 
(yet) on a large scale. We also emphasize syntactic search. That is, we search for 
WWW documents that have some words or patterns as content, which may (in 
most cases) or may not reflect the intrinsic semantics of the text. Although there 
are techniques to preprocess natural language and extract the text semantics, 
they are not 100% effective and they are also too costly for large amounts of 
data. In addition, in most cases they work with well structured text, a thesaurus 
and other contextual information. 

We now mention the main problems posed by the WWW. We can divide 
them into two classes: problems of the data itself and problems of the user. The 
first are: 

— Distributed data: due to the intrinsic nature of the Web, data spans 
many computers and platforms. These computers are interconnected with 
no predefined topology and with very different bandwidths. 

— High percentage of volatile data: due to Internet dynamics, new computers 
and data can be added or removed easily. We also have relocation problems 
when domain or file names change. 

— Large volume: the exponential growth of the WWW poses scaling issues that 
are difficult to cope with. 

— Unstructured data: most people say that the WWW is a distributed 
hypertext. However, this is incorrect. Any hypertext has a conceptual model 
behind it, which organizes and adds consistency to the data and the hyperlinks. 
That is hardly true in the WWW, even for individual documents. 

— Quality of data: the WWW can be considered as a new publishing medium. 
However, there is, in most cases, no editorial process. So, data can even be 
false, invalid (for example, because it is too old), poorly written or, typically, 
full of errors from different sources (typos, grammatical mistakes, OCR 
errors, etc.). 

— Heterogeneous data: in addition to having to deal with multiple media types 
and hence with multiple formats, when talking only about text, we also have 
different languages and, what is worse, different alphabets, some of them 
very large (for example, Chinese or Japanese Kanji). 

Most of these problems are not solvable by software improvements alone, for 
example the cross-language or bad quality issues. Those will not change (and 
in some cases should not!) or would require changing working habits. The second 
class of problems is faced by the user. Given the above, there are basically two 
problems: how to query and how to manage the answer to the query. Without 
taking into account the content semantics of a document, it is not easy to specify 
a query precisely, unless it is very simple. On the other hand, if we are able to 
pose the query, the answer might be a thousand WWW pages. How do we 
handle a large answer? How do we rank the documents (that is, how do we select 
the documents that really are of interest to the user)? In addition, a single 
document could be large. How do we browse efficiently in such documents? 

So, the overall challenge is, in spite of the intrinsic problems posed by the 
WWW, to circumvent all of them and answer the questions above, such that a 
good query can be sent to the search system, obtaining a manageable and 
relevant answer. The organization of this paper is as follows. We first describe 
and model the WWW. This is the first step towards understanding its complexity 
and being able to analyze possible solutions. Second, we outline the main ways 
used today to search the Web, giving some examples. Third, we outline several new 
results that should help in (partially) solving some of the problems outlined. 
Among them, we can mention compression techniques allowing random access, 
use of available text structure, visual query languages and visual browsing. Some 
of the results presented are part of an Iberoamerican project funded by CYTED 
(AMYRI) whose goal is the research and development of techniques and 
tools to search the WWW [13]. 

2 Measuring and Modeling the WWW 

In an interesting article, already (sadly) outdated, Tim Bray [16] studied different 
statistical measures of the WWW, from simple questions such as how many servers 
there are in the WWW to characterizing WWW pages. Currently, there are nearly 
one million servers, counting domain names starting with www. However, as not 
all WWW servers have the www prefix, the real number is higher. On the other 
hand, the number of independent institutions that have WWW information is 
much smaller, because many places have multiple servers. The exact percentage is 
unknown, but should be more than 30% which was the result back in 1995. 

The most popular formats of WWW documents are HTML followed by GIF, 
TXT, PS, and JPG, in that order. What does a typical HTML page look like? First, 
most pages are not standard, meaning that they do not comply with all the HTML 
specifications. In addition, although HTML is an instance of SGML, HTML 
documents seldom start with a formal document type definition. Second, they are 
small, a few Kb (around 6 to 10), and have no images. The pages that do have 
images use them for presentation purposes, such as colored bullets and lines. The 
average page has between 5 and 15 references (links) and most of them are local 
(to their own WWW server hierarchy). However, on average no external server 
points to it (commonly there are only local links). In fact, in 1995, around 80% of 
the pages had fewer than 10 links pointing to them. 

The top ten most referenced sites are Microsoft, Netscape, Yahoo!, and 
top US universities. In those cases we are talking about sites being pointed to by 
at least ten thousand places. On the other hand, the site with most external links 
is Yahoo!. In some sense, Yahoo! is the glue of the WWW. Otherwise, we would 
have many isolated portions (this is the case with most personal WWW pages). 
If we assume that each HTML page has 6Kb and there are 100 pages per server, 
for one million servers we have at least 600Gb of text. The real volume should 
be larger. 
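
The volume estimate is simple arithmetic and can be checked with a short 
sketch. The constants are the figures quoted in the text; the function name 
and the use of decimal units (1 Gb = 10^6 Kb) are assumptions made only for 
illustration. 

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
PAGE_SIZE_KB = 6          # assumed average HTML page size
PAGES_PER_SERVER = 100    # assumed number of pages per server
SERVERS = 1_000_000       # approximate number of WWW servers


def estimated_text_volume_gb(page_kb=PAGE_SIZE_KB,
                             pages=PAGES_PER_SERVER,
                             servers=SERVERS):
    """Total text volume in Gb, assuming decimal units (1 Gb = 10**6 Kb)."""
    total_kb = page_kb * pages * servers
    return total_kb / 10**6


print(estimated_text_volume_gb())  # 600.0
```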

Can we model the document characteristics of the whole WWW? We will 
make a first attempt. The first problem is the distribution of document sizes, 
which has been found to have self-similarity [19]. This can be modeled using 
a “heavy-tail” distribution. That is, the majority of documents are small, but 
there is a non-trivial number of large documents. This is intuitive for image or 
video files, but it is also true for HTML pages. The simplest “heavy-tail” 
distribution is the Pareto distribution: the probability of a document of size $x$ 
is $\alpha k^{\alpha} / x^{1+\alpha}$, where $k$ and $\alpha$ are parameters of the 
distribution. For text files, $\alpha$ is about 1.36, being smaller for images and 
other binary formats. In fact, for less than 50Kb, images are the typical files, from 
there to 300Kb we have audio files, and over that to several megabytes we have 
video files. 
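
To make the "heavy tail" concrete, the following sketch assumes the standard 
Pareto density $\alpha k^{\alpha}/x^{1+\alpha}$ for $x \ge k$, whose tail is 
$P(X > x) = (k/x)^{\alpha}$. The value $\alpha = 1.36$ is the one quoted for text 
files; the minimum size $k = 1$ (in Kb), the seed, and all function names are 
hypothetical choices for illustration. 

```python
import random


def pareto_tail(x, k, alpha):
    """P(X > x) for a Pareto distribution with minimum k and shape alpha."""
    if x < k:
        return 1.0
    return (k / x) ** alpha


def sample_pareto(k, alpha, rng):
    """Draw one document size by inverse-transform sampling."""
    u = 1.0 - rng.random()          # u lies in (0, 1], so we never divide by zero
    return k / u ** (1.0 / alpha)


rng = random.Random(0)
k, alpha = 1.0, 1.36                # k = 1 Kb is a hypothetical minimum size
sizes = [sample_pareto(k, alpha, rng) for _ in range(100_000)]

# Most documents are small, yet a non-trivial fraction is large.
small = sum(s <= 10 for s in sizes) / len(sizes)
print(f"fraction of documents up to 10 Kb: {small:.3f}")
print(f"predicted fraction above 10 Kb:    {pareto_tail(10, k, alpha):.3f}")
```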

For text files, the second important thing is the number of distinct words or 
vocabulary of each document. We use Heaps' Law [24]. This is a very precise 
law ruling the growth of the vocabulary in natural language texts. It states that 
the vocabulary of a text of $n$ words is of size $V = K n^{\beta} = O(n^{\beta})$, where 
$K$ and $\beta$ depend on the particular text. $K$ is normally between 10 and 100, 
and $\beta$ is between 0 and 1 (not included). Some recent experiments [8,11] show 
that the most common values for $\beta$ are between 0.4 and 0.6. Hence, the 
vocabulary of a text grows sublinearly with the text size, in a proportion close to 
its square root. 
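
As a small illustration of this sublinear growth, the sketch below evaluates 
Heaps' Law, $V = K n^{\beta}$, and also counts the vocabulary of an actual word 
list for comparison. The defaults $K = 50$ and $\beta = 0.5$ are hypothetical 
values chosen inside the ranges quoted above, and the function names are 
assumptions. 

```python
def heaps_vocabulary(n, K=50.0, beta=0.5):
    """Predicted number of distinct words in a text of n words: K * n**beta."""
    return K * n ** beta


def observed_vocabulary(words):
    """Number of distinct words seen after reading each word of the text."""
    seen, growth = set(), []
    for w in words:
        seen.add(w)
        growth.append(len(seen))
    return growth


# With beta = 0.5, a text 100 times longer has a vocabulary 10 times larger.
ratio = heaps_vocabulary(100_000_000) / heaps_vocabulary(1_000_000)
print(ratio)
print(observed_vocabulary("a b a c b d".split()))
```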

A first inaccuracy appears immediately. Supposedly, the set of different words 
of a language is bounded by a constant (e.g. the number of different English words 
is finite). However, the limit is so high that it is much more accurate to assume 
that the size of the vocabulary is $O(n^{\beta})$ instead of $O(1)$, although the number 
should stabilize for huge enough texts. On the other hand, many authors argue 
that the number keeps growing anyway because of the errors (such as typos) 
that texts contain. 

Another inconsistency is that, as the text grows, the number of different 
words will grow too, and therefore the number of letters needed to represent all 
the different words will be $O(\log(n^{\beta})) = O(\log n)$. Therefore, longer and 
longer words should appear as the text grows. The average length could be kept 
constant if shorter words are common enough (which is the case). In practice, this 
effect is not noticeable and we can assume an invariable length, independent of the 
text size. 

The third issue is how the different words are distributed inside each 
document. A much more inexact law is Zipf's Law [38,23], which rules the 
distribution of the frequencies (that is, number of occurrences) of the words. 
The rule states that the frequency of the $i$-th most frequent word is $1/i^{\theta}$ times 
that of the most frequent word. This implies that in a text of $n$ words with a 
vocabulary of $V$ words, the $i$-th most frequent word appears $n/(i^{\theta} H_V(\theta))$ 
times, where 

$$H_V(\theta) = \sum_{i=1}^{V} \frac{1}{i^{\theta}}$$ 

so that the sum of all frequencies is $n$. The value of $\theta$ depends on the text. In 
the simplest formulation, $\theta = 1$, and therefore $H_V(\theta) = O(\log n)$. However, 
this simplified version is very inexact, and the case $\theta > 1$ (more precisely, 
between 1.5 and 2.0) fits the real data better [8]. This case is very different, since 
the distribution is much more skewed, and $H_V(\theta) = O(1)$. 
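
The behaviour of Zipf's Law can be explored numerically. The sketch below 
assumes the usual generalized form, in which the $i$-th most frequent word 
appears $n/(i^{\theta} H_V(\theta))$ times with $H_V(\theta) = \sum_{i=1}^{V} 1/i^{\theta}$. 
The function names and the sample values of $n$, $V$ and $\theta$ are 
hypothetical. 

```python
def harmonic(V, theta):
    """Generalized harmonic number H_V(theta) = sum of 1 / i**theta, i = 1..V."""
    return sum(1.0 / i ** theta for i in range(1, V + 1))


def zipf_frequencies(n, V, theta):
    """Expected occurrences of the i-th most frequent word, for i = 1..V."""
    h = harmonic(V, theta)
    return [n / (i ** theta * h) for i in range(1, V + 1)]


def top_share(V, theta, k):
    """Fraction of the text covered by the k most frequent words."""
    return harmonic(k, theta) / harmonic(V, theta)


freqs = zipf_frequencies(n=1000, V=50, theta=1.0)
print(round(sum(freqs)))            # the frequencies add up to n

# For theta > 1 the normalizing sum stays bounded as the vocabulary grows,
# and a handful of words covers most of the text.
print(harmonic(100_000, 2.0))
print(top_share(V=100_000, theta=1.5, k=100))
```

The last line also hints at why removing a small set of very frequent words 
has such a large effect on index size. 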

The fact that the distribution of words is very skewed (that is, there are a 
few hundred words which take up 50% of the text) suggests a concept which 
is frequently used in full-text retrieval: stopwords [31]. A stopword is a word 
which does not carry meaning in natural language and therefore can be ignored 
(i.e. made not searchable), such as "a", "the", "by", etc. Fortunately, the most 
frequent words are stopwords, and therefore half of the words appearing in a 
text need not be considered. This makes it possible, for instance, to reduce 
significantly the space overhead of indices for natural language texts. 

3 Searching the WWW 

There are basically three different approaches to searching the WWW. Two of 
them are well known and frequently used. The first is to use search engines that 
index all the WWW as a full-text database. The second is to use Internet directories 
(catalogues or yellow pages). The third, not yet fully available, is to search 
the WWW as a graph. In the next paragraphs we outline and exemplify the two 
main approaches currently available. 

3.1 Search Engines 

Most search engines use the crawler-indexer architecture. Crawlers are pieces 
of software that traverse the WWW sending new or updated pages to a main 
server where they are indexed. That index is used in a centralized fashion to 
answer queries submitted from different places on the Internet. The largest search 
engines in WWW coverage are Hotbot [3], Altavista [18], Northern Light [4], and 
Excite [1], in that order. According to recent studies the coverage of the WWW 
by these engines varies from 28 to 55% [5] or 14 to 34% [27], as the number 
of WWW pages is estimated at 200 to 320 million. More facts about search 
engines can be found in the last two references. 

The WWW pages found by the search engine are ranked, usually using the 
number of occurrences of the query on each page. In most cases this is effective; 
in others it may not have any meaning, because relevance is not fully correlated 
with query occurrence. The user can refine the query by constructing more complex 
queries based on the previous answer. As the users receive only a subset of the 
answer (the first 10 to 100 matches), the search engine should keep each answer 
in memory, so that it is not necessary to recompute it if the user asks for the next 
subset. In addition to words, search engine user interfaces allow pages to be 
filtered by using boolean operators, and geographic, language or date segmentation. 
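
Ranking by number of occurrences can be sketched in a few lines. This is only 
a toy illustration under stated assumptions: pages are held in memory as plain 
strings, words are split on whitespace, and the page names and function name 
are hypothetical. It does not describe how any particular engine works. 

```python
from collections import Counter


def rank_by_occurrences(pages, query):
    """Return page names ordered by how often the query words occur in them."""
    terms = query.lower().split()
    scores = {}
    for name, text in pages.items():
        counts = Counter(text.lower().split())
        score = sum(counts[t] for t in terms)
        if score > 0:                       # pages without any query word are dropped
            scores[name] = score
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


pages = {
    "a.html": "jaguar car jaguar dealer",
    "b.html": "the jaguar is a big cat",
    "c.html": "shogi rules",
}
print(rank_by_occurrences(pages, "jaguar"))   # ['a.html', 'b.html']
```

The example also shows the weakness mentioned above: the page that repeats the 
word most often is ranked first, whether or not it is the one the user wants. 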

The main problem faced by these engines is the gathering of data, because 
of the highly dynamic nature of the WWW, saturated communication links and 
heavily loaded WWW servers. Another important problem is the volume of the 
data. Hence, these schemes may not be able to cope with WWW growth in the 
near future. 

There are several other variants of the crawler-indexer architecture. Among 
them we can mention Harvest [15], which uses a distributed architecture to 
gather and distribute data, being more efficient. However, its main drawback 
is that it needs the coordination of several WWW servers. Another variant is 
WebGlimpse [29], which attaches a small search box to the bottom of every HTML 
page, and allows the search to cover the neighborhood of that page or the 
whole site, without having to stop browsing. We also have to mention search 
engines that specialize in specific topics, for example the Search Broker [28] 
or the Miner family [35]. Finally, we have the metasearchers. These are WWW 
servers that use several engines, collect the answers and unify them. Examples 
are Metacrawler [36] and SavvySearch [20]. 

3.2 WWW Directories 

The best example of WWW directories is Yahoo! [2]. Directories are hierarchical 
taxonomies (trees) that classify human knowledge. The main advantage of this 
technique is that if we find what we are looking for, the answer will be in most 
cases useful. On the other hand, the main disadvantage is that the classification 
is not specialized enough and that not all WWW pages are classified. The latter 
problem gets worse every day as the WWW grows. Efforts to do automatic 
classification in AI are very old. However, to this day, natural language 
processing is not 100% effective in extracting relevant terms from a document. 
Nowadays, classification is done by a limited number of people. 

3.3 Finding the Needle in the Haystack 

Now we give a couple of search examples. One problem with full-text retrieval is 
that although many queries can be effective, many others are a total disappointment. 
The main reason is that a set of words does not capture all the semantics of a 
document. There is too much contextual information, explicit or even implicit, 
which we understand when we read. For example, suppose that 
we want to learn oriental games such as Shogi or Go. In the first case, searching for 
Shogi quickly returns good WWW pages where we can find what Shogi 
is (a variant of chess) and its rules. However, for Go the task is complicated, 
because, unlike Shogi, it is not a unique word in English. We can add 
more terms to the query, such as game and Japanese, but still we are out of luck, as 
the pages found are almost all about Japanese games written in English where 
the common verb go is used. 

The following example, taken from [9], explains this problem better; here 
the ambiguity comes from within the same language. Suppose that we want to find 
the running speed of the jaguar, a big South American cat. A first naive search 
in Altavista would be jaguar speed. The results are pages that talk about the 
Jaguar car, an Atari video game, a US football team, a local network server, etc. 
The first page about the animal is ranked 183 and is a fable, without information 
about the speed. In a second try, we add the term cat. The answers are about 
the Clans Nova Cat and Smoke Jaguar, LMG Entreprises, fine cars, etc. Only the 
page ranked 25 has some information on jaguars but not the speed. Suppose we 
try Yahoo!. We look at Science: Biology: Zoology: Animals: Cats: Wild_Cats 
and Science: Biology: Animal_Behavior. No information about jaguars there. 
We can try to do a more specific search, for example using Live Topics. However 
here we also have a shortage of topics, so searching by jaguar only returns cars 
or football teams. 

The lessons learned from the example are that search engines still return too 
much hay to find the needle while the directories do not have enough depth 
to find the needle. So, we can use the following rules of thumb: 

— Specific queries: look at an Encyclopedia. 

— Broad queries: use directories. 

— Vague queries: use search engines. 

4 Improvements to Inverted Files 

Most indices use variants of the inverted file. An inverted file is a sorted list 
of words (vocabulary), each one having a set of pointers to the pages where it 
occurs. As we mentioned before, a set of frequent words or stopwords is not 
indexed. This reduces the size of the index. Also, it is important to point out 
that a normalized view of the text is indexed. Normalizing operations include 
removal of punctuation, reduction of multiple spaces to just one space between 
each word, conversion of uppercase to lowercase letters, use of synonyms through 
a thesaurus, etc. For more information on Information Retrieval algorithms and 
data structures see [22]. 

State of the art techniques can reduce an inverted file to about 20% of the 
text (the Altavista index has around 200Gb and 16 DEC Alpha servers are used 
to answer the queries, each one with several processors and 8Gb -sic- of RAM). 
A query is answered by doing a binary search on the sorted list of words. If we 
are searching for multiple words, the results have to be combined to obtain the 
final answer. This step will be efficient if each word is not too frequent. 
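
A minimal in-memory sketch of an inverted file follows: a sorted vocabulary, 
one list of page numbers per word, lookup by binary search, and combination of 
several words by intersecting their lists. Normalization is reduced to 
lowercasing and whitespace splitting, and the stopword set, sample pages and 
function names are hypothetical. 

```python
from bisect import bisect_left


def build_inverted_file(pages, stopwords=frozenset()):
    """Return (sorted vocabulary, parallel list of sorted page-id lists)."""
    postings = {}
    for page_id, text in enumerate(pages):
        for word in text.lower().split():
            if word not in stopwords:       # stopwords are not indexed
                postings.setdefault(word, set()).add(page_id)
    vocabulary = sorted(postings)
    lists = [sorted(postings[w]) for w in vocabulary]
    return vocabulary, lists


def lookup(vocabulary, lists, word):
    """Binary search for one word in the sorted vocabulary."""
    i = bisect_left(vocabulary, word)
    if i < len(vocabulary) and vocabulary[i] == word:
        return lists[i]
    return []


def search_all(vocabulary, lists, words):
    """Pages containing every query word (intersection of the lists)."""
    result = None
    for w in words:
        found = set(lookup(vocabulary, lists, w))
        result = found if result is None else result & found
    return sorted(result or [])


pages = ["the jaguar is a big cat", "jaguar cars are fast", "a fast cat"]
vocabulary, lists = build_inverted_file(pages, stopwords={"the", "is", "a", "are"})
print(lookup(vocabulary, lists, "jaguar"))            # [0, 1]
print(search_all(vocabulary, lists, ["fast", "cat"])) # [2]
```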

Inverted files can also point to actual occurrences. However, that is too costly 
in space for the WWW, because then each pointer has to specify a page and a 
position inside the page (word numbers can be used instead of actual bytes). On 
the other hand, having the positions we can answer phrase searches or proximity 
queries, by finding words that are after each other or nearby in the same page, 
respectively. 
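
When word positions are stored, phrase and proximity queries reduce to 
comparing position lists. The sketch below is one simple way to do this, using 
word numbers rather than byte offsets; the data layout and function names are 
assumptions chosen for clarity, not efficiency. 

```python
def build_positional_index(pages):
    """Map each word to {page_id: [word positions]} (word numbers, not bytes)."""
    index = {}
    for page_id, text in enumerate(pages):
        for position, word in enumerate(text.lower().split()):
            index.setdefault(word, {}).setdefault(page_id, []).append(position)
    return index


def phrase_search(index, phrase):
    """Pages where the words of the phrase occur one after the other."""
    words = phrase.lower().split()
    if not words:
        return []
    hits = []
    for page_id, starts in index.get(words[0], {}).items():
        for start in starts:
            if all(start + k in index.get(w, {}).get(page_id, ())
                   for k, w in enumerate(words)):
                hits.append(page_id)
                break
    return sorted(hits)


def proximity_search(index, w1, w2, max_distance):
    """Pages where w1 and w2 occur within max_distance words of each other."""
    a, b = index.get(w1, {}), index.get(w2, {})
    return sorted(p for p in a.keys() & b.keys()
                  if any(abs(i - j) <= max_distance for i in a[p] for j in b[p]))


docs = ["big cat runs", "cat big", "a big fast cat"]
positional = build_positional_index(docs)
print(phrase_search(positional, "big cat"))
print(proximity_search(positional, "big", "cat", 2))
```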

Queries for words starting with a given prefix are solved by doing two binary 
searches in the sorted list of words. More complex searches, like words with errors, 
arbitrary wildcards or, in general, any regular expression on a word, can be 
performed by doing a sequential scan over the vocabulary. This may seem slow, but 
the best sequential algorithms for this type of query can achieve nearly 5Mb per 
second and that is more or less the vocabulary size for 1Gb. Then, for several Gbs 
we can answer in a few seconds. For the WWW this is still too slow (around three 
minutes for the Altavista index) but not completely out of the question. 
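
Both kinds of vocabulary search can be sketched as follows. The prefix query 
uses two binary searches that delimit a contiguous range of the sorted 
vocabulary; the upper bound is obtained by appending the largest Unicode code 
point to the prefix, which is an implementation shortcut assumed here (it 
ignores words that themselves contain that character). The regular expression 
query is a plain sequential scan. Function names and the sample vocabulary are 
hypothetical. 

```python
import re
from bisect import bisect_left


def prefix_range(vocabulary, prefix):
    """All words starting with prefix, found with two binary searches."""
    lo = bisect_left(vocabulary, prefix)
    hi = bisect_left(vocabulary, prefix + "\U0010ffff")
    return vocabulary[lo:hi]


def regex_scan(vocabulary, pattern):
    """Sequential scan of the vocabulary for words fully matching a pattern."""
    rx = re.compile(pattern)
    return [w for w in vocabulary if rx.fullmatch(w)]


words = ["car", "cat", "catalog", "cats", "dog"]     # must be sorted
print(prefix_range(words, "cat"))    # ['cat', 'catalog', 'cats']
print(regex_scan(words, "ca."))      # ['car', 'cat']
```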

Pointing to pages or to word positions is an indication of the granularity of the 
index. The index can be less dense if we point to logical blocks instead of pages. In 
this way we reduce the variance of the different document sizes, by making all 
blocks roughly the same size. This not only reduces the size of the pointers 
(because there are fewer blocks than documents) but also reduces the number of 
pointers because words have locality of reference (that is, all the occurrences 
of a non-frequent word will tend to be clustered in the same block). This idea 
was used in Glimpse [30], which is the core of Harvest [15]. Queries are resolved 
in the same way in the vocabulary and then are sequentially searched in the 
corresponding blocks (exact sequential search can be done at over 7Mb per second). 
Glimpse originally used only 256 blocks, which was efficient up to 200Mb for 
searching words that were not too frequent, using an index of only 2% of the 
text. However, by tuning the number of blocks and the block size, reasonable 
space-time trade-offs can be achieved for larger document collections. In fact, in 
[11] we show that for searching words with errors we can have sublinear space and 
search time simultaneously. These ideas cannot be used (yet) for the WWW because 
sequential search cannot be afforded, as it implies a network access. However, 
in a distributed architecture where the index is also distributed, logical blocks 
make sense. 
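
The block addressing idea can be illustrated with a toy sketch: the index 
records only which blocks contain a word, and the exact occurrences are found 
by scanning those blocks sequentially. This is a simplified illustration and 
not the Glimpse implementation; blocks are fixed-size runs of words, and the 
function names and sample text are assumptions. 

```python
def build_block_index(words, block_size):
    """Split the text into fixed-size blocks and map each word to its blocks."""
    blocks = [words[i:i + block_size] for i in range(0, len(words), block_size)]
    index = {}
    for block_id, block in enumerate(blocks):
        for w in block:
            index.setdefault(w, set()).add(block_id)
    return blocks, index


def block_search(blocks, index, word):
    """Word positions of `word`, scanning only the blocks the index points to."""
    block_size = len(blocks[0]) if blocks else 0
    positions = []
    for block_id in sorted(index.get(word, ())):
        for offset, w in enumerate(blocks[block_id]):   # sequential scan
            if w == word:
                positions.append(block_id * block_size + offset)
    return positions


text = "a b c a d e f a".split()
blocks, block_index = build_block_index(text, block_size=3)
print(block_index["d"])                        # {1}
print(block_search(blocks, block_index, "a"))  # [0, 3, 7]
```

Larger blocks give a smaller index but more sequential scanning per query, 
which is the space-time trade-off discussed above. 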

The last issue is compression. Inverted lists can be partly compressed, in 
particular lists of occurrences with high granularity. In those cases, the list is 
a sequence of ascending integers where the differences are much smaller than 
the values themselves. Therefore, we can store just the first value and a sequence 
of differences. This complicates the query evaluation a bit, but space gains are 
obtained. Compression can also be applied to the text. However, most compression 
schemes are context dependent. That is, to decompress we have to decompress 
the whole file. Nevertheless, by using Huffman byte-coding over words (not 
letters), compression ratios of 30% coupled with random access to the compressed 
file are achieved [32]. This compression technique can be combined with the 
logical block scheme, obtaining an 8-fold improvement over normal sequential 
search by searching the compressed query word over the compressed text. The 
improvements come from the fact that only one third of the I/O is done and 
searching over a shorter file is much faster. 
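
Storing the first value followed by differences can be sketched as below. Only 
the transformation itself is shown; in practice the small differences would 
then be written with a variable-length code, which is left out here. The 
function names are hypothetical. 

```python
from itertools import accumulate


def to_gaps(postings):
    """Replace an ascending list by its first value followed by differences."""
    if not postings:
        return []
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]


def from_gaps(gaps):
    """Recover the original list by running sums; needed at query time."""
    return list(accumulate(gaps))


occurrences = [1000, 1003, 1010, 1012]
gaps = to_gaps(occurrences)
print(gaps)               # [1000, 3, 7, 2]
print(from_gaps(gaps))    # [1000, 1003, 1010, 1012]
```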



5 Visual Query Languages 

Traditional systems used words and boolean operations (and, or, butnot) to 
retrieve information. However, common users are often confused by these 
operators, partly due to how we use logical connectives in normal language. That 
problem still remains today, but search engines have improved the searching 
interfaces to make things clearer (for example, using “all of”, “some of”, or 
“none of” the words). Another solution is to use a visual metaphor to represent 
the boolean operations. For example, a spatial relation, where the horizontal axis 
specifies groups of words that must be together while the vertical axis specifies 
that at least one of the groups must be present. 
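
One way to read the spatial metaphor is as a list of rows: words placed side 
by side in a row must all be present, and rows stacked vertically are 
alternatives. The sketch below evaluates such a layout against a document; the 
list-of-lists representation and the function name are assumptions made for 
illustration. 

```python
def matches(document_words, layout):
    """True if some row of the layout has all of its words in the document."""
    present = set(document_words)
    return any(all(w in present for w in row) for row in layout)


# Two rows: (jaguar AND speed) OR (cat AND running)
layout = [["jaguar", "speed"], ["cat", "running"]]
print(matches("the running cat".split(), layout))       # True
print(matches("jaguar car dealer".split(), layout))     # False
```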

Another way to enhance content queries is to use the structure of the 
document, for example, HTML structure. A query to find a word near an image 
can be much more precise than just searching for the word. For this, the index 
must include structural elements, adding little space, as structure is usually sparse. 
There are several proposals for query languages over content and structure [14]. 
However, most of them are too complicated for the final user. This drawback can 
be circumvented by using a visual query language. This is very natural as structure 
is highly correlated with the layout of a document. So, specifying a word near 
an image using a palette of elements is not too difficult. 

A proposed metaphor for a simple visual query language is given in [12]. 
What the user usually sees is a page of text. So our visual language will be 
a page where the structure is composed from a set of predefined objects and 
the content is written where we want to find it. Each structure element will 
be a rectangle with its name at the top (using the special name Text if it is 
a content element). Each content element is placed inside the rectangle. Union 
of queries is obtained by putting rectangles beside each other. Inclusion is 
obtained by placing rectangles inside rectangles. Figure 1 shows a query where 
the Text element a must be inside a chapter and should not include documents 
having the content element c. 



Fig. 1. Example of a visual query. 



6 Visual Browsing 

Most visual representations focus on some specific aspects. In text retrieval 
we can distinguish visualizations for a single document, several documents or 
queries. Most of the time only one of those elements is visualized. In recent 
years, several visual metaphors have been designed for each element; we describe 
some of them next. 



6.1 Text Visualization 

Possible text visualizations follow, in addition to the normal one, which is the 
text itself. 

— One possible view is a normal text window with an augmented scroll-bar 
(similar to [33] but in a different context) which has marks where the text 
positions appear in the document. The scroll-bar can be viewed as a complete 
compact view of the text [10]. 

— The text itself is shown with fish-eye zooming where the query occurs, giving 
some adjacent lines to understand the context. The number of lines can be 
modified by the user. We call this an elastic text [10]. 

— Only the text layout is given, in multiple columns [21]. Colored lines may 
indicate where the query occurs. 

Fig. 2. Answer visualized by VIBE. 



6.2 Document Visualization 

Nowadays there are more than 20 proposals for visualizing a set of documents. 
The VIBE system [34] is based on user-given points of interest of the query (using
weighted attributes). These points (query words) act as gravitational forces that
attract the documents according to the number of occurrences of each query word (see
Figure 2). Documents are displayed with different icon types and sizes.
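
As a rough illustration of the attraction metaphor, the sketch below places a document at the occurrence-weighted average of the positions of the points of interest. The function name `vibe_position`, the 2-D coordinates and the use of raw occurrence counts as weights are assumptions made for illustration; they are not details taken from [34].

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]

def vibe_position(poi: Dict[str, Point], counts: Dict[str, int]) -> Optional[Point]:
    """Place a document at the weighted centroid of the points of interest.

    poi    -- hypothetical screen position of each query word
    counts -- number of occurrences of each query word in the document
    """
    total = sum(counts.get(term, 0) for term in poi)
    if total == 0:
        return None  # the document is attracted by no point of interest
    x = sum(poi[t][0] * counts.get(t, 0) for t in poi) / total
    y = sum(poi[t][1] * counts.get(t, 0) for t in poi) / total
    return (x, y)

poi = {"agent": (0.0, 0.0), "query": (4.0, 0.0)}
print(vibe_position(poi, {"agent": 1, "query": 3}))  # (3.0, 0.0): pulled towards "query"
```

A document with more occurrences of one query word ends up closer to that word's point, which is the behaviour the gravitational description suggests.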

Another metaphor for the document space based on inter-particle forces,
like VIBE, is proposed in [17]. In [37] the document space is abstracted from
a Venn diagram to an iconic display called InfoCrystal. One advantage of this
scheme is that it is also a visual query language. Visual tools in three dimensions
to handle the document space are presented in LyberWorld [26]. Visualization
of the occurrence frequency of terms in different text segments of a document is
presented in [25].

Now we present a more elaborate metaphor for manipulating and filtering
an answer given by a set of documents [10]. Figure 3 shows an instance of the
visual analysis tool that we propose for advanced users. We use a “library”
or “bookpile” analogy depending on whether the tool is used horizontally or vertically,
because both are possible. Each document (seen as a book) is represented as a
rectangle with a particular color, height, width and position in the set. Each
of these graphical attributes, including the order of the list, can be mapped
to a document attribute (occurrence density, size, date, etc.). In the example,
the order and the color are mapped to the same attribute (for example, the
creation date). These mappings allow the user to study different correlations of attributes
on the document set, helping the user to select the desired documents. A select
button allows the user to choose a document subset by using the mouse (the wide-border
rectangle in the example). A prototype is explained in [7] and available in [6].




[Figure: a list of books (documents) with the menu buttons Order, Color, Width, Height, View, Select, Options.]



Fig. 3. Analyzing and selecting a document set. 

The mapping of the attributes is selected by the menu buttons below the book
list. The way the books are seen can also be changed. The set of documents can
be forced to fit into the window (as in the example), or presented using a predefined
choice of maximal/minimal widths and heights (using a scroll-bar if bigger than
the window). Another view is a fish-eye representation for large sets, focusing
where the user wants (by clicking on the appropriate sector with the mouse).



6.3 Query Visualization 

The relations between query terms and the answer can also be visualized [10]. For
example, a pie view can show the percentage of documents of the total database
that were selected. Another possibility is to show the distribution of occurrences
among the terms of the query using boxes of different sizes. This view is useful to
know which terms are the best filters in the given query. A third possibility is to
show the distribution of the selected documents over the logical or physical
space of the database. This view can show if there is any logical locality of reference associated
with the query.
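
The first two views reduce to simple ratios. The following minimal sketch, with hypothetical function names and made-up counts, computes the percentage of selected documents (the pie view) and the share of occurrences contributed by each query term (the box sizes).

```python
from typing import Dict

def selected_percentage(selected: int, total: int) -> float:
    """Percentage of the database selected by the query (pie view)."""
    if total <= 0:
        raise ValueError("the database must contain at least one document")
    return 100.0 * selected / total

def occurrence_shares(occurrences: Dict[str, int]) -> Dict[str, float]:
    """Fraction of all occurrences contributed by each query term (box sizes)."""
    total = sum(occurrences.values())
    if total == 0:
        return {term: 0.0 for term in occurrences}
    return {term: n / total for term, n in occurrences.items()}

print(selected_percentage(25, 200))               # 12.5
print(occurrence_shares({"text": 3, "image": 1})) # {'text': 0.75, 'image': 0.25}
```

In such a view, a term with a small share would appear as a small box, hinting that it filters the answer strongly.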

7 Conclusions 

A careful integration of some of the new results presented here can help to
partially solve the problems of searching the WWW. For example, a truly cooperative
and distributed architecture similar to Harvest can diminish WWW
traffic and extend the scalability of search engines. In each local index, compression
and logical blocks can be used, obtaining smaller indices and documents.
This is one of the goals of the AMYRI project [13] mentioned in the introduction.
The first step will be a client-server architecture, followed by a distributed
architecture for the search engine server.

Abstract. To answer this question, we propose a general model of coordination
in Multi-Agent Systems. Autonomous agents first recognise how
they depend on each other (they may need or prefer to interact about
the same or different goals), and then, in the negotiation phase, exchange
offers in the form of commissive speech acts. Finally, agents adopt social,
interlocking commitments if an agreement is reached.

Joint plans are seen as deals and team activity as a special case of social
activity in which, since the agents share a common goal, every possible
deal is profitable. Consequently, notions traditionally involved in Cooperative
Problem Solving, such as help and joint responsibility, are applied
to any social interaction. Therefore, the answer to our question is yes.



1 Introduction 

The main concern in Distributed Artificial Intelligence (DAI) is how to design 
interaction protocols so that agents coordinate their behaviour. In Multi-Agent
Systems (MAS), autonomous agents are devised by different designers, and have
individual motivation to achieve their own goal and to maximise their own utility.
Thus, no assumptions can be made about agents working together cooperatively. 
On the contrary, agents will cooperate only when they can benefit from that 
cooperation. 

Most of the formal models presented in MAS focus on analysing isolated
aspects of the coordination problem, such as dependence nets [5,14],
joint intentions [11,6], social plans [12], or negotiation models [17,18,10,15].
As far as we know, only Wooldridge and Jennings [21] have tried to represent
the whole process in Cooperative Problem Solving (CPS) domains, where autonomous
agents happen to have a common goal and then acquire social attitudes
before forming a group.

A more comprehensive coordination framework has been presented in [3]:
agents with probably disparate and even conflicting goals reason about their dependence
relations and exchange offers following “social” strategies until they
reach a deal. The resulting conditional commitments oblige them to abide by
the agreements in societies. Therefore, there is no need for the agents to swear
to act as a group in a team-formation stage.

This research has been supported by the Ministerio de Educación y Cultura del
Gobierno Español (EX 97 30605211).

The purpose of this paper is to prove that this model of coordination accounts
for teams as well as for societies, and that the rules governing team action do
not diverge from those controlling societies. We will illustrate that social notions
traditionally related to CPS domains, such as help or joint responsibility, are
applicable both to teams and societies. Therefore, all kinds of groups can be
represented with a single model.

The remainder of the paper is structured as follows. In the second section 
we introduce our concept of autonomous agent; in the third section an analysis 
of dependence relationships is presented; in the fourth section, the negotiation 
process is described; finally, we define societies as the result of the coordination 
process and show that the terms of the agreements explain any “social” concept. 
For simplicity, the model is presented in two-agent task-oriented domains. Due
to space restrictions we do not define the language in full here, but readers
are referred to [2].

2 Autonomous Agents 

Agents are autonomous but probably not autosufficient entities. Each agent is
viewed as an independent “cognitive object” with his own beliefs, abilities, goal,
and utility function. In our model goals are not fixed. Agents can compare and
make decisions about plans and/or deals that satisfy different subgoals or that
satisfy their goals only partially. This ability to relax one’s goal opens up new
opportunities for agreement, and enlarges the space of cooperation.

In order to model agents’ behaviour we use a branching tree structure [7] (as
is common in the MAS literature [12,20]), where each branch depicts an alternative
execution path, $\pi_i$. Each node in the structure represents a certain state of the
world and each transition an action. Formally, $\pi_i = (s_0, \ldots, s_{j-1}, a_j, s_j, \ldots, s_n)$. The
set of actions associated with a path is defined as $act(\pi_i) = \{a_1, \ldots, a_{n-1}\}$. We can
identify goal/subgoal structures with particular paths through the tree, each leaf
labeled with the utility obtained by traversing this path. Those leaves with the
highest worth might be thought of as those that satisfy the full goal, while others,
with lower worth values, only partially satisfy the goal.

The rationality of a behaviour is understood according to its utility in the
scale of preferences given a maximising policy. Nevertheless, in order to avoid
uncontextualised decisions, utilities are defined with regard to agents’ (sub)goals¹.
Following [13],

Definition 1. The utility of a (sub)path for an agent is the difference between
the worth of the (sub)goal achieved executing this path and its cost. Therefore,
if $GOAL(x, \phi_i)$, $ACH(\pi_i, \phi_i)$, and $cost(x, \pi_i) = \{a_j : a_j \in \pi_i \wedge Ag(a_j, x)\}$,

$utility(x, \pi_i) = worth(x, \phi_i) - cost(x, \pi_i)$

¹ Haddawy showed in [8] that for simple step utility functions, choosing the plan that
achieves a goal leads to choosing the plan that maximises utility.

Definition 2. The solution set for a goal: $Sol(\phi_i) = \{\pi_i \mid ACH(\pi_i, \phi_i)\}$. That is,
$Sol(\phi_i)$ defines the space of all possible solutions of $\phi_i$. These solutions are ordered
according to their utility in the scale of preferences.
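
Definitions 1 and 2 can be sketched in a few lines. The sketch below is only illustrative: the `Path` class, the unit cost per executed action and the `worth` table are assumptions introduced here, since the paper does not fix how costs and worths are measured.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

@dataclass(frozen=True)
class Path:
    """One execution path: who executes which action, and which goals it achieves."""
    name: str
    steps: Tuple[Tuple[str, str], ...]  # (action, executing agent)
    achieves: FrozenSet[str]

def cost(agent: str, path: Path) -> float:
    # Assumption: every action executed by the agent costs one unit.
    return float(sum(1 for _, who in path.steps if who == agent))

def utility(agent: str, path: Path, goal: str, worth: Dict[str, float]) -> float:
    # Definition 1: worth of the achieved (sub)goal minus the agent's cost.
    return worth[goal] - cost(agent, path)

def solution_set(agent: str, paths: List[Path], goal: str,
                 worth: Dict[str, float]) -> List[Path]:
    # Definition 2: all paths achieving the goal, ordered by decreasing utility.
    solutions = [p for p in paths if goal in p.achieves]
    return sorted(solutions, key=lambda p: utility(agent, p, goal, worth), reverse=True)

p1 = Path("p1", (("a1", "x"), ("a2", "x")), frozenset({"g"}))
p2 = Path("p2", (("a1", "x"), ("a2", "y")), frozenset({"g"}))
p3 = Path("p3", (("a3", "x"),), frozenset({"h"}))
ranked = solution_set("x", [p1, p2, p3], "g", {"g": 5.0})
print([p.name for p in ranked])  # ['p2', 'p1']: x executes fewer actions on p2
```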



3 Dependence Analysis 

In the first phase of the coordination process agents try to recognise how they
depend on each other. This recognition stage is crucial since it expresses agents’
motivations and explains why they might be interested in coordinating their
actions. Consequently, this dependence analysis establishes the rules under which
deals are arranged and guides the whole coordination process.

First, we consider agents’ condition. 

Definition 3. An agent is autosufficient for a given goal according to a set of 
paths if each path in this set achieves this goal and the agent is able to execute 
every action appearing in it. 

Henceforth $Sol(\phi_i) = \{\pi_1, \ldots, \pi_n\}$ and $\{\pi_w, \ldots, \pi_x\} \subseteq \{\pi_1, \ldots, \pi_n\}$.

$AUTOSUFFICIENT(x, \{\pi_w, \ldots, \pi_x\}, \phi_i)$ iff

$\forall \pi_i \in \{\pi_w, \ldots, \pi_x\}\; \forall a_i \in act(\pi_i)\; Ag(a_i, x)$

On the other hand, there will probably be other paths in the solution set of the
goal that the agent is not able to execute. For these paths, the agent is said
to be deficient.

$DEFICIENT(x, \{\pi_w, \ldots, \pi_x\}, \phi_i)$ iff

$\forall \pi_i \in \{\pi_w, \ldots, \pi_x\}\; \exists a_i \in act(\pi_i)\; \neg Ag(a_i, x)$
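
The two predicates translate directly into quantified checks. In this minimal sketch, $Ag(a, x)$ is modelled as membership of the action in a set of abilities of the agent, and a set of paths is a dictionary from path names to action sets; both representations are assumptions made for illustration.

```python
from typing import Dict, Set

def autosufficient(abilities: Set[str], paths: Dict[str, Set[str]]) -> bool:
    """Every action of every path in the set can be executed by the agent."""
    return all(actions <= abilities for actions in paths.values())

def deficient(abilities: Set[str], paths: Dict[str, Set[str]]) -> bool:
    """Every path in the set contains at least one action the agent cannot execute."""
    return all(any(a not in abilities for a in actions) for actions in paths.values())

paths = {"p1": {"a", "b"}, "p2": {"a", "c"}}
print(autosufficient({"a", "b", "c"}, paths))  # True
print(deficient({"a"}, paths))                 # True
print(autosufficient({"a", "b"}, paths), deficient({"a", "b"}, paths))  # False False
```

The last line shows that the two conditions are not complementary over a set of paths: an agent able to execute p1 but not p2 is neither autosufficient nor deficient for the pair.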

We now define how agents depend on each other: $N(x, y, \{a_j, \ldots, a_k\}, \phi_i)$ and
$W(x, y, \{a_j, \ldots, a_k\}, \phi_i)$ mean that x needs or “weakly” depends on y with regard to
$\{a_j, \ldots, a_k\}$ for achieving $\phi_i$. In addition, $\oplus$ and $\ominus$ are the “charge” of the
relation: $\oplus$ means that the dependence relationship is about the execution of
the actions related to $\phi_i$, whilst $\ominus$ is about their omission. There are four possible
basic dependence relations².

² For simplicity, we have constrained the model to one relation each time, but agents
can depend on each other in many different intermixed ways. For example, two agents
with the same goal can need each other about two subpaths and depend weakly on
one another about the rest of the path at issue. Moreover, different relations can
arise if alternative paths are taken into consideration.

1. $\oplus N(x, y, \{a_j, \ldots, a_k\}, \phi_i) \Leftrightarrow$

$\forall \pi_i \in \{\pi_1, \ldots, \pi_n\}\; DEFICIENT(x, \pi_i, \phi_i) \;\wedge$
$\exists \delta_i = \{\alpha, \{a_j, \ldots, a_k\} \subseteq act(\{\pi_1, \ldots, \pi_n\})\}$,

That is, an agent needs positively the other agent to execute a set of actions
if and only if he is not able to execute any subpath in his goal’s solution set
and there exists a deal such that y executes a subset of the actions associated
with this goal.

This is a purposely vague definition, because the actions involved in the
deal depend on how y is affected by x. The important thing is to realise
that x needs y not only because of his own inefficiency but also because
there is space for cooperation. The deal guarantees the required space of
cooperation since every deal is supposed to be individual-rational, that is,
it must improve both agents’ position.

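2. $\oplus W(x, y, \{a_j, \ldots, a_k\}, \phi_i) \Leftrightarrow$

$AUTOSUFFICIENT(x, \{\pi_w, \ldots, \pi_x\}, \phi_i) \;\wedge$
$\exists \delta_i = \{\alpha, \{a_j, \ldots, a_k\} \subseteq act(\{\pi_1, \ldots, \pi_n\})\}$,

$\{a_j, \ldots, a_k\}$ can be the actions associated with any path in the solution set.
The agreed path will be a subset of a path satisfying x’s goal. The only
condition is that $utility(x, \pi') > utility(x, \pi_i \in \{\pi_w, \ldots, \pi_x\})$, where $\pi'$ is the agreed path.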

3. $\ominus N(x, y, \{a_j, \ldots, a_k\}, \phi_i) \Leftrightarrow$

$AUTOSUFFICIENT(x, \{\pi_w, \ldots, \pi_x\}, \phi_i) \;\wedge$
$INHIBIT(\{a_j, \ldots, a_k\}, \{\pi_w, \ldots, \pi_x\}) \;\wedge$
$\exists \delta_i = \{\alpha, \neg\{a_j, \ldots, a_k\}\}$,

where a set of actions inhibits a set of paths if and only if there is no extension
of these paths containing members of the set of actions. Probably, stand-alone
paths are preferred to the ones resulting from the deal, but x has
no choice. Therefore, $\{a_j, \ldots, a_k\}$ must be optional in y’s goal solution set.
Otherwise, there is an open conflict. Usually, these actions will be adopted
to be used as threats.

4. $\ominus W(x, y, \{a_j, \ldots, a_k\}, \phi_i) \Leftrightarrow$

$AUTOSUFFICIENT(x, \{\pi_w, \ldots, \pi_x\}, \phi_i) \;\wedge$
$INHIBIT(\{a_j, \ldots, a_k\}, \{\pi_j, \ldots, \pi_k\} \subset \{\pi_w, \ldots, \pi_x\}) \;\wedge$
$\exists \delta_i,\; \delta_i = \{\alpha, \neg\{a_j, \ldots, a_k\}\}$,

In this case, the set of actions inhibits some of the autosufficient paths.
However, there is a deal from which the inhibited paths can be executed and
whose utility is greater than the utility of non-inhibited paths.

One significant feature of our model is that only bilateral relations are al- 
lowed. Unlike in [5,14,21], an agent cannot act in society according exclusively 
to its individual needs or preferences, or offer deals to achieve his goal without 
taking into account others’ motivations. We can analyse the space of interaction 
according to the “charge” of the relations as follows: 

1. We can say that social interaction takes place in three possible cases (a 
similar approach is presented in [13]): 

(a) a cooperative situation is one in which each agent welcomes the existence
of the other agent. That is, when they depend positively on each other.
This is usual in mutual relations (when agents share the same goal) because
it is always profitable for both agents to share the load of executing
the associated plan;

(b) a compromise situation is one in which both agents would prefer to be
alone (they depend on each other negatively). However, since they are
forced to cope with the presence of the other, they will agree on a deal.
This is typical when one agent’s gain always entails the other’s loss;

(c) a neutral situation is one in which one agent is in a cooperative situation
and the other is in a compromise one.

2. On the other hand, we say that social co-action happens in two circumstances:

(a) a conflict situation is one in which agents encounter each other but there is no deal
that resolves their possible interactions. For example, when agents have
parallel goals or in killer-victim relations. In such cases, although they
do depend on each other, their relation is not subject to coordination. It
is true that they need to reason about each other’s behaviour (the victim
will try to anticipate the killer’s behaviour in order to escape, and vice
versa) but the nature of their dependence is a-social;

(b) agents are independent if there is at least one of them whose goal is not
affected by the other’s goal. As an example, we have the parasite agent, who
waits for the other to achieve his goal.
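
The taxonomy above can be read as a small decision procedure. The sketch below is one possible reading: the encoding of the charge as "+", "-" or `None` (no dependence) and the `deal_exists` flag are assumptions introduced for illustration.

```python
from typing import Optional

def classify_situation(charge_x: Optional[str], charge_y: Optional[str],
                       deal_exists: bool) -> str:
    """Classify a two-agent encounter by the charge of each agent's dependence.

    charge_x -- "+" or "-" for how x depends on y, None if x's goal is unaffected
    charge_y -- the same for y's dependence on x
    """
    for charge in (charge_x, charge_y):
        if charge not in ("+", "-", None):
            raise ValueError(f"unknown charge: {charge!r}")
    if charge_x is None or charge_y is None:
        return "independent"          # co-action: one goal is unaffected
    if not deal_exists:
        return "conflict"             # co-action: no deal resolves the interaction
    if charge_x == "+" and charge_y == "+":
        return "cooperative"
    if charge_x == "-" and charge_y == "-":
        return "compromise"
    return "neutral"                  # one cooperative, the other compromise

print(classify_situation("+", "+", True))   # cooperative
print(classify_situation("+", "-", True))   # neutral
print(classify_situation("-", "-", False))  # conflict
```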

According to the “weight” of each agent in the interaction, there are two
types of social interaction: symmetric situations, SYM, in which both agents
need or “weakly” depend on each other; or asymmetric situations, ASYM, in
which x needs y, and this second agent only “weakly” depends on the first. In
that case, it is said that y has power (POW) over x. Accordingly, an agent x
does not have power over another agent y just because y needs x (unlike in [4]), but the
powerful agent must have some motivation to exploit his dominating status. If
either of the agents is not interested in interacting, talking about dependence
relationships is pointless.

If we now compare our model with others in the MAS literature, we
substantially enlarge the space of cooperation, as

- we allow agents to negotiate in need or preference (only [21] studies both
cases);

- agents can negotiate not only about common goals, but about disparate and
even conflicting goals. This is because agents are allowed to relax their initial
goals and negotiate about subgoals and/or degrees of satisfaction of their
respective goals. Imagine two hunters trying to get the same hare. They have
common compatible subgoals, to catch the prey, but two parallel final goals,
to eat it. So, they will cooperate about that subgoal (because coordinating
their attack increases their chances), and then compete openly;

- deals can be about the execution or the omission of actions (the possibility
of “non-negative contribution” is just pointed out in [9]).

4 Negotiation Process 

Once agents have a model of the interaction situation, they exchange offers
directly. Negotiation is a process through which, at each point in time, one
agent, say x, proposes an agreement from the negotiation set (NS), and the
other agent, y, either accepts the offer or does not. If the offer is accepted, then
the negotiation ends with the implementation of the agreement. Otherwise, the
second agent has to either make a counteroffer, or reject x’s offer and abandon
the process.
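
The alternating exchange just described can be sketched as a loop. This is only an illustrative skeleton: the representation of a deal as a pair, the reply values "accept" and "abandon", and the `max_rounds` bound are assumptions introduced here, not part of the protocol in [3,2].

```python
from typing import Callable, List, Optional, Tuple, Union

Deal = Tuple[str, str]            # hypothetical encoding: (x's part, y's part)
Reply = Union[str, Deal]          # "accept", "abandon", or a counteroffer
Responder = Callable[[Deal], Reply]

def negotiate(first_offer: Deal, responders: List[Responder],
              max_rounds: int = 10) -> Optional[Deal]:
    """Alternate offers until one is accepted or an agent abandons.

    responders[0] answers the first offer, responders[1] the counteroffer, etc.
    Returns the agreed deal, or None if no agreement is reached.
    """
    offer = first_offer
    for round_no in range(max_rounds):
        reply = responders[round_no % 2](offer)
        if reply == "accept":
            return offer              # negotiation ends with this agreement
        if reply == "abandon":
            return None               # offer rejected and process abandoned
        offer = reply                 # otherwise the reply is a counteroffer
    return None

# y insists on swapping the parts; x accepts whatever is proposed.
y = lambda offer: "accept" if offer == ("b", "a") else ("b", "a")
x = lambda offer: "accept"
print(negotiate(("a", "b"), [y, x]))  # ('b', 'a')
```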

We are not introducing here a detailed model of the negotiation procedure 
(see [3,2]). In this paper our only concern is about those aspects of the model 
that will help us to show that groups are just societies of agents that share a 
goal. One of these aspects has to do with how joint commitments are understood. 
We consider that in order to avoid references to irreducible “social” notions the 
contents of joint intentions must be tracked throughout the coordination process, 
and that the conditions of individual social commitments must be expressed as 
arguments in the offers. 

We say that an agent x offers a deal if he requests the other agent y to be
socially committed to execute some action and asserts that if y confirms such
commitment, he will commit himself to execute another action. Formally (the
speech act operators come from [16]),

1. $OFF(x, y, \delta_i = \{\alpha, \beta\}) =$

$REQ(x, y, SOC\text{-}COM(y, x, \beta, \alpha)) \;\wedge$
$ASS(x, y, (CONF(y, x, SOC\text{-}COM(y, x, \beta, \alpha)) \Rightarrow SOC\text{-}COM(x, y, \alpha, \beta)))$.

2. Using this definition, counteroffers are easily defined as a refusal followed by
another offer.

The content of such commitments must specify the social conditions under which
these engagements persist or are dropped. We say that an agent is socially
committed to another agent to execute an action if he has the intention of
executing it until he believes that his part of the deal is true or will never
be true, when he adopts the goal of having this situation mutually believed.
Moreover, an agent’s social commitment will also be abandoned if he believes
that his partner has failed in executing his part. In this case, the goal of having
this fact mutually believed is interpreted as acknowledgement.

Definition 4. $SCOM(x, y, \alpha, \beta) =_{def}$
UNTIL $BEL(x, \neg SCOM\text{-}C(x, \alpha, \beta))$
$INT(x, \alpha)$
WHEN $GOAL(x, MBEL(x, y, \neg SCOM\text{-}C(x, \alpha, \beta)))$,

Definition 5. SCOM — C(x, o,/5) = 

[BEL(x, a) A BEL(x, B^a) A BEL(x, 

where $\beta$ is the other agent’s part. Therefore, social commitments are conditioned.
This is because in MAS, where benevolence is not assumed, negotiation is only
understood according to this quid pro quo policy. When an agent x makes an
offer, the actions requested are those for which he believes he depends on y; on the other
hand, the actions he promises to be committed to if y accepts the offer are those
actions for which he believes y depends on him. That is, negotiation steps are created according
to dependence relationships. Thus, unlike in Wooldridge and Jennings’s
proposal [21], there is no need for a team-formation phase.

To what agreements can agents come? As we study asymmetric relations,
the only a priori condition for a bargain to be in the space of deals is that it
has to be individual-rational. It would be “unfair” to ask a dominant agent to
accept a Pareto-Optimal deal if he can obtain more profit from another deal. By
individual-rational we mean that both agents must improve their position with
the deal. So, if an agent has a stand-alone plan, the deal must not decrease his
utility; if the agent has no such alternative to the negotiated agreement, the
deal must give him non-negative utility. The search for “fair” deals is presented
in two ways:

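Strict Mechanism: Firstly, we define a “fairness” one-to-one function from
the set of situations to the set of deals, $f : SIT \rightarrow NS$, giving the values depicted
in Fig. 1 (where SYMN means symmetric necessity relation, and so on):

\[
\begin{aligned}
1.\;& f(SYMN(x, y, \alpha, \beta)) = \delta_i = \{\alpha, \beta\} \\
2.\;& f(SYMW(x, y, \alpha, \beta)) =
\begin{cases}
\delta_i & \text{iff } P\text{-}Opt_{\{x,y\}}(NS) = \delta_i \\
\delta_i & \text{iff } P\text{-}Opt_{\{x,y\}}(NS) = \{\Delta\} \wedge r(\{\Delta\}) = \delta_i
\end{cases} \\
3.\;& f(POW(x, y, \alpha, \beta)) =
\begin{cases}
\delta_i & \text{iff } greatest_x(NS) = \delta_i \\
\delta_i & \text{iff } maximal_x(NS) = \{\Delta\} \wedge greatest_y(\{\Delta\}) = \delta_i \\
\delta_i & \text{iff } maximal_x(NS) = \{\Delta\} \wedge flat(x, \{\Delta\}) \wedge r(\{\Delta\}) = \delta_i
\end{cases}
\end{aligned}
\]

Fig. 1. The “fairness” function.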



1. If the agents are in a symmetric situation and they need each other, then
the only “fair” solution consists in exchanging the actions involved.

2. If the agents are in a symmetric situation and they depend weakly on each
other, then the deal can be:

2.1. The Pareto-Optimal deal, if it is unique.

2.2. If there are several Pareto-Optimal deals, then the resulting agreement
will be chosen at random from the set of “fair” deals. This happens, for example, when
autosufficient agents with a common goal meet, as they may be indifferent
about how tasks should be distributed, or when there are two
Pareto-Optimal deals implying an odd number of actions.

3. If one agent has power over the other, then the “fair” deal will be:

3.1. The one maximising his utility, the greatest in his scale of preferences.

3.2. If it is not unique, then the “fair” deal will be the most preferred by the
dominated agent among the dominant’s set of maximals.

3.3. If the dominated agent is indifferent, then one of the deals in this set is
chosen at random.
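
The three cases of the strict mechanism can be sketched as follows. This is a simplified reading under stated assumptions: deals are plain labels, each deal carries a pair of numeric utilities (one per agent), the exchange deal for the SYMN case is passed in explicitly, and the index `dominant` names the powerful agent. None of these representations comes from the paper.

```python
import random
from typing import Dict, List, Optional, Tuple

Utilities = Dict[str, Tuple[float, float]]  # deal -> (utility of agent 0, agent 1)

def pareto_optimal(utils: Utilities) -> List[str]:
    """Deals for which no other deal is at least as good for both and better for one."""
    def is_dominated(d: str) -> bool:
        u0, u1 = utils[d]
        return any(v0 >= u0 and v1 >= u1 and (v0 > u0 or v1 > u1)
                   for e, (v0, v1) in utils.items() if e != d)
    return [d for d in utils if not is_dominated(d)]

def fair_deal(kind: str, utils: Utilities, exchange: Optional[str] = None,
              dominant: int = 0, rng=random) -> Optional[str]:
    if kind == "SYMN":                      # case 1: exchange the actions involved
        return exchange
    if not utils:
        raise ValueError("empty negotiation set")
    if kind == "SYMW":                      # case 2: Pareto-Optimal, random if several
        candidates = pareto_optimal(utils)
    elif kind == "POW":                     # case 3: dominant first, then dominated
        dominated = 1 - dominant
        best = max(u[dominant] for u in utils.values())
        maximal = [d for d, u in utils.items() if u[dominant] == best]
        top = max(utils[d][dominated] for d in maximal)
        candidates = [d for d in maximal if utils[d][dominated] == top]
    else:
        raise ValueError(f"unknown situation: {kind}")
    return candidates[0] if len(candidates) == 1 else rng.choice(candidates)

utils = {"d1": (3.0, 1.0), "d2": (3.0, 2.0), "d3": (1.0, 5.0)}
print(fair_deal("POW", utils, dominant=0))  # d2: best for agent 0, then for agent 1
```

In the POW case the random choice only happens when the dominated agent is indifferent among the dominant agent's maximal deals, which mirrors rule 3.3.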



Tolerant Mechanism: It is more useful to apply a tolerant mechanism in
dynamic environments in which failures in the execution of the agreements will
occur quite often. We introduce a set of ordered deals $(\Delta, <)$ representing the
“fair” deal associated with each possible situation derived from the actual one. As
a result, agents agree not only on the specific deal which is carried out
first, but also on the deals in reserve. So, in case of failure agents will
use the following automatic rule:

RENEG: In case of failure in the execution of a specific deal, eliminate it from
the set of deals and apply the “fairness” function according to the situation
generated.

Example 1. Imagine that the two agents share the same goal and that there is a
SYMW relationship between them. The deal is, therefore, assumed to be a simple
task distribution. We have the following set of possible deals
$\Delta = \{\{\alpha, \beta\}, \{\beta, \alpha\}, \{\emptyset, (\alpha, \beta)\}, \{(\alpha, \beta), \emptyset\}\}$. As agents depend weakly on each
other, the “fair” deal must be Pareto-Optimal, one of the first two deals in $\Delta$.
Imagine that $\{\alpha, \beta\}$ is chosen at random and that $FAILS(x, \alpha)$. Then, there will
be two possible deals left, namely $\{\beta, \alpha\}$ and $\{\emptyset, (\alpha, \beta)\}$. Moreover, the relation
between the agents has changed, and now y will have power over x. Therefore, the
first deal, the one maximising y’s utility, is chosen. We can go one step further,
and suppose that x fails again executing $\beta$. In this case, the only possible deal to
be carried out is $\{\emptyset, (\alpha, \beta)\}$.
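
Example 1 can be replayed with a small sketch of the RENEG rule. The encoding is an assumption made for illustration: a deal is a pair (actions of x, actions of y), a failure is recorded as (agent index, action), and the powerful agent's preferred deal is taken to be the one in which he executes the fewest actions, which stands in for applying the “fairness” function when both agents share the goal.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Deal = Tuple[FrozenSet[str], FrozenSet[str]]  # (actions of x, actions of y)

def deal(x_part: Iterable[str], y_part: Iterable[str]) -> Deal:
    return (frozenset(x_part), frozenset(y_part))

def reneg(deals: List[Deal], failed: Set[Tuple[int, str]]) -> List[Deal]:
    """Drop every deal that asks an agent for an action it has failed to execute."""
    return [d for d in deals
            if not any(action in d[agent] for agent, action in failed)]

def preferred_by(agent: int, deals: List[Deal]) -> Deal:
    # Assumption: with a shared goal, fewer own actions means higher utility.
    return min(deals, key=lambda d: len(d[agent]))

X, Y = 0, 1
deals = [deal(["alpha"], ["beta"]), deal(["beta"], ["alpha"]),
         deal([], ["alpha", "beta"]), deal(["alpha", "beta"], [])]

failed = {(X, "alpha")}                      # FAILS(x, alpha)
left = reneg(deals, failed)
print(len(left))                             # 2 deals remain
print(preferred_by(Y, left) == deal(["beta"], ["alpha"]))  # True: y now has power

failed.add((X, "beta"))                      # x fails again, executing beta
print(reneg(deals, failed) == [deal([], ["alpha", "beta"])])  # True
```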

5 After Negotiation 

In our proposal each agent is still considered independent after negotiation. Joint 
plans are understood as deals through which agents make social commitments 
to execute particular actions, not to act as a group. We can define now joint 
commitments simply as the conjunction of the social commitments involved. 

Definition 6. $JCOM(x, y, \delta_i) =_{def} SCOM(x, y, \alpha, \beta) \wedge SCOM(y, x, \beta, \alpha)$

Societies are seen as groups of agents with a joint commitment to execute 
the agreed deal. For us, a group of agents forms a society when they reach an 
agreement, not when they decide to act as such and try jointly to reach an 
agreement. That is, the notion of society is a result of coordination and not its 
condition. 

Our approach explains certain unclear aspects of teamwork. It is common in
the MAS literature to refer to the team as a whole when there is no way of attaching
individual attitudes to its members.

Example 2. Consider a team of two pilots whose goal is to carry as many helicopters
to a point as possible. Following the joint intentions framework it is
sufficient if the team reaches that point; each individual need not do so individually.
In so doing, the team is reified and the task distribution enigma arises [1].
We think that our approach is more natural. In this case, we have three possible
deals, $\{\{\alpha, \beta\}, \{\alpha, \emptyset\}, \{\emptyset, \beta\}\}$, where each part means an agent reaching that
point. The order of preference is $\{\alpha, \beta\} > \{\alpha, \emptyset\} = \{\emptyset, \beta\}$. The RENEG rule is
applied automatically, each deal satisfying the goal to some degree: $\{\alpha, \beta\}$ satisfies
the goal completely, as all the pilots reach the point, whereas $\{\alpha, \emptyset\}$ and
$\{\emptyset, \beta\}$ satisfy the goal only partially. In any case, each agent achieves the goal if
one of the deals is accomplished.

We adopt this individualistic approach to stress the bargaining nature of 
social interactions, and that agents coordinate their behaviour and form societies 
with regard to common interests, not to common goals, as their motivations can 
be very disparate. From this point of view, agents do not agree and form groups 
to achieve a goal, but to execute deals that will achieve (perhaps partially) their 
(common or not) goals. 

Therefore, if the negotiation protocol ends in agreement, agents will adopt a
joint commitment and form a society.

Definition 7. $SOCIETY(\{x, y\}, \delta_i) =_{def}$

$MBEL(x, y, SIT(x, y, \sigma_i)) \;\wedge$
$MBEL(x, y, f(SIT) = \delta_i) \;\wedge$
$MBEL(x, y, JCOM(x, y, \delta_i))$

Of course, this notion of society is quite basic. Things are more complicated 
in real environments, where team action can involve notions of social justice 
and social welfare. In the end, groups will behave according to the ideology or 
interests of the designer. However, it is worth pointing out that we are working 
with systems (MAS) where the utilitarian point of view is closely related to 
liberalism. If notions of global utility are taken into account, agents will stop 
being autonomous, and some kind of community spirit will control our designs. 

6 Groups 

In this final section we exemplify how “help” and “joint responsibility” are un- 
derstood in our model, and conclude that, all in all, both concepts are applicable 
both to teams and societies. 

According to Tuomela [19] one of the most important notions of cooperative 
activity is the one of help, in the sense of (extra) actions strictly contributing to 
other participants performing their parts well. In MAS agents are not assumed 
to be benevolent and will therefore cooperate and help each other when they can 
benefit from that cooperation (that is, when the cost of “helping” actions does 
not exceed the gains accruing from them). So, as long as agents have common
interests, they will keep executing RENEG. Since everything is arranged before
execution starts, no action can be interpreted as altruistic help.

Why are groups so special? We have seen in Example 1 that when agents
share a common goal their preferences are highly positively correlated. Whenever
a collection of agents has the same goal, it is in their own interest to help each
other. This is because in teams the problem of coordination is in practice the
problem of how to distribute the goal tasks. Even if x is suddenly unable to
execute any action, y will execute the entire plan, because he has nothing to
lose: the deal will be equal to his stand-alone plan, the goal’s plan-structure. He
cannot refuse to execute it because of x’s failure. Otherwise, he himself will not
achieve the goal. Having a common goal, they are in a situation in which they are
destined to cooperate and act jointly. However, help is not unique to groups. In
societies where agents have different goals, the renegotiation rule will be applied
until the corresponding deal is no longer individual-rational.

What about joint responsibility? Agents with joint responsibility are supposed
to be equally rewarded or blamed for the actions they execute as a collective.

Example 3. Imagine a football team playing a crucial match. Each player will
receive a medal if the team wins or will be fired if they lose. Suppose that they
have never played before, so that they agree on a set of deals, the first consisting
of eleven actions meaning the usual distribution of tasks (goalkeeper, defender,
sweeper, striker, etc.) and five different empty sets for the substitutes. Suppose
now that the striker, number 9, is sent off, so that agents must apply the renegotiation
rule and execute the second plan according to the new circumstances.
For example, the wings must center their positions in the pitch and try to score.
Then, if the team wins, should the number 9 be awarded? And, if they lose, should
he be fired? Should the substitutes be awarded or blamed? The intuitive answer
to these questions is “yes”. Now, we have a model that explains why: as agents
agreed on the other deals as part of the “general” deal, everyone in the team is
responsible for the outcome.

7 Conclusions 

In this paper we have presented a model of coordination in which agents first 
recognise how they depend on each other and then exchange offers and coun- 
teroffers until they reach an agreement. Agreements can be executed following 
a strict mechanism, or can involve the use of a renegotiation rule that provides 
different deals. In this last case, agents agree on a set of deals, so that deals are 
executed in a given order according to the changing conditions and the ability of
the agents. Using this mechanism we have illustrated how notions linked to CPS,
such as help and joint responsibility, are explained without mentioning any social
attitude. Everything is settled in the terms of the agreements. Therefore, we
have concluded that there is no point in adopting different mechanisms of coor- 
dination for teams or societies. Any collective will follow the same coordination 
mechanism, regardless of whether agents share the same goal or not. 

There are several issues to be addressed in future work, the most obvious of 
which is the need for refinement of the model. Moreover, the model should cope 
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better with uncertainty: Agents can have incomplete knowledge and different 
points of view, so argumentation turns to be an essential part in coordination. 
Finally, the tolerant mechanism works well in games in which the rules are well- 
known, but real-life social interactions are usually far more complicated than a 
football match. The study of multiple-encounters and roles will hopefully allow 
us to identify and characterise the constant environmental factors required to 
find equilibria between efficiency and stability. 

Abstract. Agents (in AI and DAI) are founded upon theories related to mental 
states and to the notion of architecture. However, there is still no consensus, or 
sufficient knowledge, to formulate a satisfactory theory which would define 
mental states, relating them to architectures and agent behaviour. The paper is 
located in this context and presents a theory in which a space of mental states 
is built up on types of mental states, which are defined from a set of basic 
attributes: an External Content (a declaration about a situation in the 
world); criteria to determine the unsatisfaction, uncertainty, urgency, 
insistence, intensity and importance associated with a mental state; laws of 
causality through which a mental state can produce another; and mechanisms 
for provoking, selecting, suspending and interrupting the processing of a 
mental state. 

A space of possible agents' architectures is built up on that space of mental 
states. So, from these two spaces, the paper presents a methodology to define 
and compare agents' architectures and to understand and produce the 
corresponding agents' behaviour. In addition, it is shown that this methodology 
is suitable for agent programming based on the Object Oriented Programming 
paradigm. 

1. Introduction 

There is still no consensus in regard to the entire list or the best choice of mental 
states. Over the last decade, several sub-lists have been proposed which, however, 
suffered restrictions either because of specific features of a problem domain or because 
of the limitations of the chosen logical apparatus. There is also a current feeling of 
uneasiness related to the state of the art concerning mental states in DAI. The definition 
of what an agent is remains unclear in view of disparate opinions [14]. Several 
attempts and models have been made to clarify these issues [4], in which dynamic 
aspects of reasoning based on the revision of beliefs and motivational attitudes are 
modeled in a compositional manner. Recently, Sloman [1996] advanced the theory of 
a table of mental states but without specifying how this is to be carried out. He argued 
that mental mechanisms are best considered in a global architecture context. We are 
convinced after Correa and Coelho [1993] and Correa [1994] that a complete 
functioning agent needs a specification based on the notion of architecture, but 
highlighted by the dynamics of mental reactions. The way all mental states move 
around and interact is carefully controlled, in a similar way to what occurs now with 
artificial chemical reactions [11]. The present contribution is a further step to clarify 
our holistic approach to mentality, that is, the integration among mind, architecture 
and behaviour. 

Our starting point is the observation that mental states (MS) such as Beliefs, Desires and 
Intentions are commonly defined as being structured on basic components (the same 
occurred with Mendeleev's periodic table of the elements in 1869). Cohen and 
Levesque [1990] defined Intention as being structured on "Choice" plus 
"Commitment"; in other works "Commitment" is also thought of as more basic than 
Intention [5], [20]. Werner [1991] treated Intentions as Strategies, and Bratman [1990] 
defined them as Plans. The components of Belief usually are a Proposition and a 
"true" or "false" value associated with it, or a degree of Certainty, and can also include a 
degree of Importance [18]. 

On the other hand, Sloman[1990, 1996] pointed out that urgency, importance, 
intensity and insistence are behind agent’s actions. 

We argue that mental states other than Beliefs, Desires and Intentions can be necessary 
to understand and to explain the complexities of agents' behaviour, and that a complete and 
consistent theory of agents' mental states has not only to explain and build 
artificial agents but also to understand natural ones [13]. 

For instance, Expectation is a mental state which enables more flexibility and more 
complex behaviours [17], [26], [9]. Expectations and Intentions can complement one 
another. Intentions can embody an agent's expectations that the agent will act in ways that 
satisfy those intentions, while expectations can lead an agent to form intentions to 
check that those expectations are satisfied. An Expectation can be defined in terms of 
basic attributes: an External Content ("Romeo expects that Juliet will come back"), 
uncertainty ("Romeo expects that Juliet will come back, but he is not sure"), urgency 
("Romeo expects that Juliet will come back urgently"), importance ("It is very 
important for Romeo that Juliet will come back"), intensity ("Romeo is intensively 
expecting that Juliet come back"), insistence ("Romeo continuously expects that Juliet 
will come back"), unsatisfaction ("Juliet still does not come back as Romeo 
expected"), control ("Juliet still does not come back, and Romeo cannot wait anymore 
so he has to do something"). 

Our approach considers mental states as organizations of an agent's internal processing 
of information related to its actions, and a fundamental feature of these 
organizations is that they are related to situations in the world (intentionality). Mental 
states are guides for the agent's actions, such that the agent's behaviour can be explained 
in terms of them and, on the other hand, mental states can interact to produce the agent's 
behaviour. We assume that there is a limited set of basic attributes such that a mental 
state is defined in terms of some combination of them. 

In Correa and Coelho [1997] a framework is proposed to standardize the common 
notions of usual mental states such as Beliefs, Desires and Intentions, and of others not 
so much used in DAI, such as Expectations. It is shown that this framework makes possible 
the research and application needed to build agents with other, new types of mental 
states (such as Hopes). 

In the paper it is shown that this framework can be applied to agent programming 
based on the Object Oriented Programming paradigm. Our aim is to establish a 
methodology: 1) the mental states are characterized in terms of a small set of basic 
attributes; 2) we argue that these attributes are, at least, sufficient to define mental 
states; 3) other possible mental states could also be characterized by assigning them a 
set of those attributes; 4) this mental states model can explain agents' interactions; 5) 
an Agent Oriented Programming results from this mental states model. 

2. A Framework for Mental States 

In order to have a theoretical structure to define mental states we need to find an 
agreement about their basic components or attributes. This can be obtained by 
observing the conceptions and applications of the usual mental states and filtering a 
common set of them. On the other hand, these attributes must be put together and 
analysed to see whether they can offer a base to define the usual mental states of DAI 
(Beliefs, Desires and Intentions) and others not yet widely applied, although known from 
Psychology, Philosophy and Economics as relevant to explain human behaviour, such as 
Expectations, Hopes and Necessities. When we diversify applications and work 
within Social and Human Sciences we are forced to adopt such other mental states. 

As a matter of fact assigning attributes to mental states can be validated according to 
the theories of Psychology, Cognitive Science and Philosophy [22], [23], [9], [13]. 
They must enable MS to explain agent actions and they must also allow, from the 
point of view of agent's engineering, the building of agents through the definition of 
architectures based on mental states. From these basic attributes it should be possible 
to understand the relationships among mental states and their dynamics behind agents' 
behaviours. 

To make references to these attributes we organize them in three groups: the first, 
called "nucleus", contains the mental state's external content and a set of criteria for 
unsatisfaction, uncertainty, urgency, intensity, insistence and importance; the second, 
called "laws", contains a set of possible causal relationships among mental states; and 
the last, called "controls", contains a set of controls to trigger, choose, suspend and 
cancel mental states. 

2.1 Nucleus 

These attributes define the proper characteristics of an MS; that is, MS with the same set 
of attributes are classified as being of the same type, as shown in the paragraphs below, but 
every particular MS is distinguished, at least, by its external content. 
Normally an MS is distinguished from other MS of the same type by the attributes of the 
nucleus. 



- External Content (Ex. Content) 

The mental states have external significance; that is to say, they are linked to the 
world in terms of what they are about. This external content, in terms of the Situation 
Theory formalism, is a proposition: "situation s supports infons e1, e2, ..." [2], [12]. 

- Unsatisfaction 

This attribute is related to the stimulation of actions or changes of mental states. As 
Russell [1971] pointed out, unsatisfaction is the first motor of action. Thus, we 
consider that this component is the central motivator of actions and the producer of 
other mental states. 

The degree of unsatisfaction is a function $s_p(t)$ from a proposition p and time t to the 
set of real positive numbers. A factor $\varepsilon_s$ ($\varepsilon_s \geq 0$) is necessary as a limit to decisions 
regarding agent MS unsatisfaction. An MS is satisfied at time t if $s_p(t) < \varepsilon_s$; 
otherwise the MS is unsatisfied. So, we need some criterion (or procedure) to decide whether 
an MS is satisfied or not. 

- Uncertainty 

It is a measure of the agent's confidence regarding the situation that corresponds to the 
mental state. 

The degree of uncertainty is a function $c_p(t)$ from a proposition p and time t to the set 
of real positive numbers. A factor $\varepsilon_c$ ($\varepsilon_c \geq 0$) is necessary as a limit to decisions 
related to agent MS uncertainty. For example, $c_p(t)$ can be a probability associated with a 
proposition p. 

- Urgency 

It is a measure of how much time remains until the point at which the corresponding MS 
must be satisfied. 

The degree of urgency, corresponding to a mental state X at time t, is a function $u_X(t)$ 
from a time t to the set of real positive numbers. A factor $\varepsilon_u$ ($\varepsilon_u \geq 0$) is necessary as a 
limit to decisions related to MS urgencies. 

- Intensity 

It is related to the agent's pledge and energy dedicated to an MS. For example, if an 
agent is trying to satisfy an MS, the intensity of this MS is connected to how actively 
or vigorously this satisfaction is pressed and adopted. 

The degree of intensity, corresponding to a mental state X at time t, is a function $v_X(t)$ 
from a time t to the set of real positive numbers. A factor $\varepsilon_v$ ($\varepsilon_v \geq 0$) is necessary as a 
limit to decisions related to MS intensities. 

- Importance 

The importance is related to a valuation, in terms of benefits and costs, that the agent 
has of the situation corresponding to a mental state. 

The degree of importance, corresponding to a mental state X at time t, is a function 
$m_X(t)$ from a time t to the set of real positive numbers. A factor $\varepsilon_m$ ($\varepsilon_m \geq 0$) is 
necessary as a limit to decisions related to MS importance. 



- Insistence 

This attribute is related to how difficult it is for an agent to abandon an MS. For 
instance, if the agent strongly insists on some goal, this goal will not be abandoned 
easily. 

The degree of insistence, corresponding to a mental state X at time t, is a function 
$n_X(t)$ from a time t to the set of real positive numbers. A factor $\varepsilon_n$ ($\varepsilon_n \geq 0$) is 
necessary as a limit to decisions related to agent MS insistences. 
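
All the nucleus criteria share one shape: a degree function of time compared against a threshold. The following Java fragment is a minimal sketch of that shape, not part of our testbed. The names NucleusAttribute, degreeAt and reachesThreshold are hypothetical, and we assume that a degree "reaches" its limit when it is greater than or equal to the threshold, which agrees with the rule that an MS is satisfied when its unsatisfaction degree is below the limit.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch: one nucleus attribute = degree function + threshold. */
final class NucleusAttribute {
    private final DoubleUnaryOperator degree; // maps a time t to a non-negative degree
    private final double threshold;           // the limit (epsilon), assumed >= 0

    NucleusAttribute(DoubleUnaryOperator degree, double threshold) {
        if (threshold < 0) {
            throw new IllegalArgumentException("threshold must be >= 0");
        }
        this.degree = degree;
        this.threshold = threshold;
    }

    /** Degree of the attribute at time t. */
    double degreeAt(double t) {
        return degree.applyAsDouble(t);
    }

    /** True when the degree at time t is greater than or equal to the threshold. */
    boolean reachesThreshold(double t) {
        return degreeAt(t) >= threshold;
    }
}
```

With an unsatisfaction attribute built this way, reachesThreshold(t) returning true corresponds to the MS being unsatisfied at time t.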

2.2 Laws 

The laws define how the mental states are combined to cause other mental states or 
agent's actions. For example, under certain conditions, a Belief and a Desire cause 
another Desire: agent A's Desire to learn mathematics and the Belief that agent B 
knows mathematics and that B teaches A cause A's Desire to learn mathematics from 
B. Another law is that a Desire and a Belief can cause an Intention: A's Desire to learn 
mathematics from B, together with A's Belief that there is a strategy for learning from 
agent B, namely 1) ask B if he/she wants to teach mathematics to A, and 2) if so, accept B's 
instructions regarding mathematics (that is, A must know how to learn from another 
agent), causes A's Intention to learn mathematics from B. 

A collection of laws relating Belief, Desire, Intention and Expectation is presented in 
figure 1 according to Correa [1994]. Another demonstration of such mental states 
dynamics, applied in a Tutor/Learner session, is shown in Moussale et al. [1996]. 
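
The first law of the example can be sketched in Java as follows. This is only an illustration under strong assumptions: mental states are reduced to a type and a textual content, the derived content is supplied by the caller rather than inferred, and the names LawSketch, State and desireFromBeliefAndDesire are hypothetical.

```java
import java.util.Optional;

/** Hypothetical sketch of one causal law of the form D <= B + D. */
final class LawSketch {
    enum Type { BELIEF, DESIRE, INTENTION }

    /** A mental state reduced to its type and a textual external content. */
    static final class State {
        final Type type;
        final String content;

        State(Type type, String content) {
            this.type = type;
            this.content = content;
        }
    }

    /**
     * Applies the law "a Desire and a Belief cause another Desire".
     * Returns the derived Desire, or empty when the inputs have the wrong types.
     * How derivedContent follows from the two contents is outside this sketch.
     */
    static Optional<State> desireFromBeliefAndDesire(State desire, State belief,
                                                     String derivedContent) {
        if (desire.type != Type.DESIRE || belief.type != Type.BELIEF) {
            return Optional.empty();
        }
        return Optional.of(new State(Type.DESIRE, derivedContent));
    }
}
```

In the example of the text, the inputs would be A's Desire "A learns mathematics" and A's Belief "B knows mathematics and B teaches A", and the derived Desire would carry the content "A learns mathematics from B".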

2.3 Controls 

These attributes define how and when an MS causes another, and how it can be suspended, 
cancelled or kept active. An MS is active when it produces or influences another 
MS, causes an agent action or contributes indirectly to an agent action. An MS is 
suspended when it is temporarily inactive. An MS is cancelled when it is permanently 
inactive. An embryonic idea of the relationships between behaviour and goal-terminating 
and interrupting mechanisms such as these controls was also introduced by 
Simon [1967], although he did not use the concept of mental states in that paper. 

Let X be a mental state and p the corresponding situation. The possible laws will be 
triggered if $s_p(t) \geq \varepsilon_s$ and at least one of the conditions (C1 to C7) below occurs: 

C1) $u_X(t) \geq \varepsilon_u$ and $v_X(t) \geq \varepsilon_v$ and $m_X(t) \geq \varepsilon_m$ 
C2) $u_X(t) \geq \varepsilon_u$ and $m_X(t) \geq \varepsilon_m$ 
C3) $u_X(t) \geq \varepsilon_u$ and $v_X(t) \geq \varepsilon_v$ 
C4) $v_X(t) \geq \varepsilon_v$ and $m_X(t) \geq \varepsilon_m$ 
C5) $m_X(t) \geq \varepsilon_m$ 
C6) $v_X(t) \geq \varepsilon_v$ 
C7) $u_X(t) \geq \varepsilon_u$ 

C8) Conflict resolution for choosing, suspending and cancelling conflicting mental 
states. For instance: 

If X and Y are active and conflicting MS then 
suspend X if $g_X(t) < g_Y(t)$ and $g_X(t) \geq \varepsilon_c$ 
cancel X if $g_X(t) < g_Y(t)$ and $g_X(t) < \varepsilon_c$ 
suspend Y if $g_Y(t) < g_X(t)$ and $g_Y(t) \geq \varepsilon_c$ 
cancel Y if $g_Y(t) < g_X(t)$ and $g_Y(t) < \varepsilon_c$ 

Where $g_X(t)$ and $g_Y(t)$ are the interruption functions, defined in terms of urgency, 
intensity or importance of the corresponding MS, and $\varepsilon_c$ is a real positive number to 
be used as a limit to decide if an MS will be suspended or cancelled. 

We will not present here a specific definition of conflicting mental states. We 
consider that two MS of the same type are conflicting when they cannot coexist at the 
same time in the agent's mind. 

C9) Activation of a suspended mental state. 

If X is a suspended MS and there is no active MS conflicting with X, then X is 
activated. If there is an active MS Y conflicting with X, then X is activated if 
$n_X(t) \geq \varepsilon_n$, X is kept suspended if $g_X(t) \geq \varepsilon_c$, or else X is cancelled. 

C10) Finding a strategy to satisfy an MS. 

If there is no strategy or means to satisfy an MS X then: if it is possible to find a 
strategy or means to satisfy X, find and adopt this strategy; else, if 
$g_X(t) \geq \varepsilon_c$, X is suspended; otherwise X is cancelled. 

C11) Finding alternatives when an adopted strategy doesn't work anymore. 

If K is an adopted strategy to satisfy an MS X and, at some moment, it is not 
possible to satisfy X through K, then find another strategy if possible and 
$n_X(t) \geq \varepsilon_n$; otherwise suspend X if $g_X(t) \geq \varepsilon_c$, or cancel X. 

The controls C1 to C7 shown above act as trigger mechanisms, so that an MS X alone 
(or combined with another) will produce other mental states through its possible laws 
assigned in the table. The controls C8 and C9 act as filters to activate or suspend an 
MS. C10 and C11 govern MS directly depending on agents' actions. These controls 
(C8 to C11) are a meta-strategy that corresponds to the so-called "commitment" of a 
mental state [6], [5], [20]. 
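
The triggering conditions and the conflict rule of C8 can be sketched in Java as below. This is a sketch under stated assumptions rather than our implementation: comparisons against thresholds are read as "greater than or equal to"; all of C1 to C7 are taken to be available to the mental state (in which case "at least one of C1 to C7" reduces to C5, C6 or C7, since C1 to C4 each imply one of them), whereas in the table each type is assigned only some of them; and ties between interruption values are left untouched because the rule says nothing about them. The names ControlsSketch, Decision, lawsTriggered and resolve are hypothetical.

```java
/** Hypothetical sketch of controls C1-C7 (triggering) and C8 (conflict resolution). */
final class ControlsSketch {
    enum Decision { KEEP_ACTIVE, SUSPEND, CANCEL }

    /**
     * C1-C7, assuming every condition is available to the mental state:
     * laws fire when the state is unsatisfied (s >= epsS) and at least one of
     * urgency u, intensity v or importance m reaches its own threshold.
     */
    static boolean lawsTriggered(double s, double epsS,
                                 double u, double epsU,
                                 double v, double epsV,
                                 double m, double epsM) {
        boolean unsatisfied = s >= epsS;
        return unsatisfied && (u >= epsU || v >= epsV || m >= epsM);
    }

    /**
     * C8 from the point of view of X, given an active conflicting state Y.
     * gX and gY are the interruption values of X and Y at the current time.
     */
    static Decision resolve(double gX, double gY, double epsC) {
        if (gX >= gY) {
            // The rule only acts on the weaker state; ties are not covered by it.
            return Decision.KEEP_ACTIVE;
        }
        return gX >= epsC ? Decision.SUSPEND : Decision.CANCEL;
    }
}
```

Calling resolve once for X against Y and once for Y against X yields the four cases of C8: the state with the lower interruption value is suspended when that value still reaches the limit and is cancelled when it does not.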

The relationships among these attributes and seven mental states, Beliefs (B), Desires 
(D), Intentions (I), Expectations (E), Hopes (H), Necessities (N) and Perceptions (P), 
are shown in the table of figure 1. 

The attributes assigned to Beliefs, Desires and Intentions in the table, are according to 
the DAI specialized literature and also to Philosophy. Expectations are not 
extensively used in DAI and we conjecture that they have the attributes as shown in 
the table. We include also Hopes and Necessities as a test of our theory, because they 
were not considered in DAI until now. So, we suggest that these two MS could have 
the attributes assigned to them as in the table, but this is an open question requiring 
further discussion. 

In figure 1 the types of mental states are indicated in the columns of the table and 
their corresponding attributes (marked with an x) are indicated in the rows. The group 
of rows labeled from L1 to L9 corresponds to the laws relating the mental states 
to one another. That is, let W be some MS indicated in a column: if in this column the row 
labeled "L1) B <=" is marked, then a Belief can be caused by MS W (B <= W); the 
row labeled "L2) B <= B +" means that a Belief can be caused by another Belief and 
W (B <= B + W); and similarly, "L3) D <=" means D <= W, "L4) D <= B +" means D 
<= B + W, etc. The row labeled "L10) A <=" means that the mental state W can 
produce directly some action A of the agent (A <= W). The correspondence among 
the mental states and the controls C1, C2, ... is also shown in the table. For example, 
the Beliefs (B) are defined by assigning them the attributes Ex. Content, Uncertainty, 
Insistence, Importance, L1, L2, L7, L8, L9, C8, and C9. 

In the paper we are considering only individual mental states, however the notions of 
social mental states are also necessary for more precise and complete modeling of 
agents’ interactions in a society [15], [25], [24], [7]. We are also developing these 
social notions according to our theory, but we don’t have space to present these 
advances here. 

3. The Space of Mental States and Agent Architectures 



The table of figure 1 presents types of mental states, relating them to a set of basic 
attributes. The attributes selected define the type of mental state. In a more formal 
manner, a type Z of mental state is defined by a triple $\langle N_Z, L_Z, C_Z \rangle$, where $N_Z$ is a 
set of nucleus attributes excepting the External Content, $L_Z$ is a set of laws and $C_Z$ is a 
set of controls. A mental state X of type Z is a triple $\langle S_X, Z, F_X \rangle$, where $S_X$ is the 
External Content of X, Z is the triple defined above and $F_X$ is a set of couples 
$\langle f_{iX}, \varepsilon_{iX} \rangle$, such that $f_{iX}$ is a function defined from time t to the set of real positive 
numbers, $\varepsilon_{iX}$ is a real positive number and each couple $F_{iX}$ ($F_{iX} \in F_X$) corresponds to an 
attribute $N_{iZ}$ ($N_{iZ} \in N_Z$). We denote by $Z^X$ a triple $\langle S_X, Z, F_X \rangle$ (a mental state X of 
type Z). 

Figure 1: Table of relationships among mental states Beliefs (B), Desires (D), 
Intentions (I), Expectations (E), Hopes (H), Necessities (N), Perceptions (P) and 
their possible attributes. 

Let M represent the mental states space defined by the set of possible triples 
$\langle S_X, Z, F_X \rangle$. An agent A architecture is a subset of the mental states space M 
($A \subseteq M$). The space of possible agent architectures is $2^M$, the power set of M. 

For example, the BDI architectures can be considered as sets 
$\{B^1, B^2, \ldots, B^{n_1}, D^1, D^2, \ldots, D^{n_2}, I^1, I^2, \ldots, I^{n_3}\}$, where B is the Belief type, 
D is the Desire type and I the Intention type. This means that an implementation of that 
architecture would have all the attributes corresponding to Beliefs, Desires and Intentions. 

The architecture (called SEM) presented in Correa and Coelho [1993] and Correa 
[1994] also includes the Expectation type (E), so this architecture is defined by sets 
$\{B^1, \ldots, B^{n_1}, D^1, \ldots, D^{n_2}, I^1, \ldots, I^{n_3}, E^1, \ldots, E^{n_4}\}$. It was implemented on 
the basis of architectures $\{B^1, B^2, \ldots, B^{n_1}\}$, $\{D^1, D^2, \ldots, D^{n_2}\}$, $\{I^1, I^2, \ldots, I^{n_3}\}$ and 
$\{E^1, E^2, \ldots, E^{n_4}\}$ (called sub-agents), such that 
$\{B^1, \ldots, B^{n_1}\} \cup \{D^1, \ldots, D^{n_2}\} \cup \{I^1, \ldots, I^{n_3}\} \cup \{E^1, \ldots, E^{n_4}\}$ 
is the agent architecture (global agent). 

All that suggests a methodology for constructing agents' architectures by 
implementing sub-agents in such a way that their architectures are defined by the sets 
$\{Z_k^i\}$ ($k = 1, \ldots, n$; $i = 1, \ldots, n_k$) and the global agent architecture is 
$\{Z_1^i\} \cup \ldots \cup \{Z_k^i\} \cup \ldots \cup \{Z_n^i\}$ ($i = 1, \ldots, n_k$). 
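
As an illustration of this set view, the following Java sketch builds sub-agent architectures and takes their union as the global agent architecture. It is only a sketch: mental states are reduced to string labels such as "B1", and ArchitectureSketch, subAgent and globalArchitecture are hypothetical names that do not come from the SEM implementation.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch: an architecture is a set of mental states. */
final class ArchitectureSketch {

    /** Builds a sub-agent architecture with `count` states of one type, e.g. B1, B2, ... */
    static Set<String> subAgent(String typeSymbol, int count) {
        Set<String> states = new LinkedHashSet<>();
        for (int i = 1; i <= count; i++) {
            states.add(typeSymbol + i);
        }
        return states;
    }

    /** The global agent architecture is the union of the sub-agent architectures. */
    @SafeVarargs
    static Set<String> globalArchitecture(Set<String>... subAgents) {
        Set<String> global = new LinkedHashSet<>();
        for (Set<String> sub : subAgents) {
            global.addAll(sub);
        }
        return global;
    }
}
```

A BDI architecture is then the union of a Belief, a Desire and an Intention sub-agent, and adding an Expectation sub-agent to the union gives an architecture of the SEM kind.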

One of the advantages of defining the space of mental states and the space of 
architectures as closely related is that when we are speaking about the agent's mental 
states we can also see the processes, and the flow of information involved in 
these processes, inside the architecture. These processes are implicated in the 
mental states' attributes. On the other hand, by seeing the processes and the flow of 
information inside an agent's architecture we can guess which mental states can be 
attributed to the agent. 

4. Agent Programming 

To be useful, this methodology for building agents should not only be a guide to classify 
and compare mental states and agents' architectures, since it provides a 
space in which to define them, but it should also be a guide to program agents. The 
approach of defining mental states from the basic attributes shown in the table of figure 1 
is well suited to mapping mental states to Classes and Objects of Object Oriented 
Programming (OOP), as shown below. Our experiments with agent programming 
according to the theory of the paper are currently in progress. In these testbeds the 
types of mental states are programmed as Classes in an OOP language such as the Java 
programming language [1], and a particular mental state is a particular object of the 
corresponding class. So, to program an agent we need to define its mental states as 
objects, as sketched below: 

Desire desire1 = new Desire(External_Content_D1, eSD1, eUD1, eVD1, eND1, eMD1); 

Desire desire2 = new Desire(External_Content_D2, eSD2, eUD2, eVD2, eND2, eMD2); 

Belief belief1 = new Belief(External_Content_B1, eCB1, eMB1); 

Belief belief2 = new Belief(External_Content_B2, eCB2, eMB2); 

That is, desire1 and desire2 are objects of class Desire and belief1 and belief2 are objects 
of class Belief. The External_Content arguments (D1, D2, B1 and B2) are Propositions. So, 
there are also classes defined to support the constructs of Situation Theory (infons, 
situations, inferences with situations, etc.). The other parameters correspond to the 
mental states' thresholds (see figure 1). 

A programming sketch of the mental states desires from the framework of figure 1 
can be: 

// Proposition and the other Situation Theory classes are assumed to be 
// defined elsewhere in the same package. 
public class Desire extends Thread { 
    public Proposition External_Content; 
    public float eS, eU, eV, eN, eM; // thresholds of the nucleus attributes 

    public Desire(Proposition External_Content, float eS, float eU, float eV, 
                  float eN, float eM) { 
        this.External_Content = External_Content; 
        this.eS = eS; this.eU = eU; this.eV = eV; this.eN = eN; this.eM = eM; 
    } 

    @Override 
    public void run() { // the desire object runs concurrently with other mental states 
        while (unsatisfaction() > eS && insistence() > eN) { 
            // controls to trigger the corresponding laws 
            if (urgency() > eU && importance() > eM && intensity() > eV) { 
                // trigger a law to produce another mental state 
                triggerDesiresLaw(External_Content, eS, eU, eV, eN, eM); 
            } 
        } 
    } 

    // The bodies below are placeholders; each would compute the current degree. 
    public float unsatisfaction() { return 0f; } // a desire's unsatisfaction 
    public float urgency() { return 0f; }        // a desire's urgency 
    public float insistence() { return 0f; }     // a desire's insistence 
    public float importance() { return 0f; }     // a desire's importance 
    public float intensity() { return 0f; }      // a desire's intensity 

    public void triggerDesiresLaw(Proposition EC, float eS, float eU, float eV, 
                                  float eN, float eM) { 
        // triggers the desire's corresponding laws to produce other mental states 
    } 
} // End of Desire 

We have no space, and it is not our intention, to present here a complete or 
consistent programming of the mental state framework. We only wish to show, in a 
very sketchy way, how the structure presented in figure 1 can be directly and easily 
described in an object programming language such as Java. It is important to note that a 
desire object is implemented as a Java thread, that is, it runs concurrently with the other 
mental states, which are also threads. The other mental states can 
also be programmed similarly. So, concurrent computing is a basis for this 
programming framework. 

5. Conclusion 

This article proposes a theory for defining two spaces, one for mental states and 
another for architectures. The mental states spaces may include more mental states, 
not only to analyse their validity but also their dynamics and influence on the agent 
behaviour. 

Our theory also allows for the definition of the space of possible architectures in 
terms of mental state spaces. This is an advantage because it offers tools for 
engineering and programming agents (in an Object Oriented Programming paradigm) 
and a methodology for the experimentation and the evaluation of the relations among 
mental states, architectures and agents' behaviour. 

So, modeling the actions of an agent and looking inside its mental states (sequences 
of mental events) makes the design and the programming of agents easier. 

References 

[1] K. Arnold and J. Gosling. The Java Programming Language, The Java Series, Sun 
Microsystems, 1996. 

[2] J. Barwise and J. Perry. Situations and Attitudes, A Bradford Book, The MIT 
Press, 1983. 

[3] Michael Bratman. What is Intention? In P. R. Cohen, J. L. Morgan and M. Pollack (eds.), 
Intentions in Communication, The MIT Press, Cambridge, MA, 1990. 

[4] F. Brazier, B. Dunin-Keplicz, J. Treur, R. Verbrugge. Modeling Internal Dynamic 
Behaviour of BDI Agents, Proceedings of MODELAGE Workshop, January, 1997. 

[5] Cristiano Castelfranchi. Commitments: From Individual to Groups and Organizations, 
Proceedings of The First International Conference on Multi-Agent Systems (ICMAS-95), 
1995. 

[6] P. Cohen and H. Levesque. Intention is Choice with Commitment, Artificial Intelligence, 
42:213-261, 1990. 

[7] P. Cohen, H. Levesque and I. Smith. On Team Formation. In Contemporary Action Theory, 
G. Hintikka and R. Tuomela (editors), Kluwer Academic Pub, 1997. 

[8] Milton Correa and Helder Coelho. Around the Architectural Agent Approach to Model 
Conversations, Proceedings of Modeling Autonomous Agents in a Multi-Agent World 
(MAAMAW-93), Neuchatel, Switzerland, Springer-Verlag, 1993. 

[9] Milton Correa. The Architecture of Dialogs of Distributed Cognitive Agents, Ph. D. Thesis 
(in Portuguese), Federal University of Rio de Janeiro, January, 1994. 

[10] Milton Correa and Helder Coelho. A Framework for Mental States and Agent 
Architectures. In Multi-Agents Theory and Architectures Conference, MASTA97, Coimbra, 
1997. 

[11] R. Deeth. Chemical Choreography. New Scientist, July 5th, 1997. 




From Mental States and Architectures to Agents' Programming 75 

[12] K. Devlin. Logic and Information, Cambridge University Press, 1991. 

[13] Esther Frankel and Milton Correa. A Cognitive Approach to Body-Psychotherapy. The 
Journal of Biosynthesis, Vol. 26, No. 1, Abbotsbury Publications, April, 1995. 

[14] S. Franklin and A. Graesser. Is it an agent, or just a program? A taxonomy for autonomous 
agents, Memphis University, Working Report, March, 1996. 

[15] N. Jennings. Controlling cooperative problem solving in industrial multi-agent systems 
using joint intentions. Artificial Intelligence 75, 1995. 

[16] Neila Moussale, Rosa Viccari and Milton Correa. Tutor-Student Interaction Modeling in 
an Agent Architecture Based on Mental States, Brazilian Symposium on AI (SBIA-96), 
Springer-Verlag, 1996. 

[17] I. Pörn. Action Theory and Social Science. Some Formal Models, Reidel Publishing 
Company, Dordrecht-Holland, 1974. 

[18] Milton Rokeach. Beliefs, Attitudes and Values, Jossey-Bass Inc., Pub., 1970. 

[19] Bertrand Russell. The Analysis of Mind, George Allen & Unwin Ltd, 1971. 

[20] Yoav Shoham. Agent-Oriented Programming, Artificial Intelligence, 60, 1993. 

[21] Herbert Simon. Motivational and Emotional Controls of Cognition, Psychological 
Review, Vol. 74, No. 1, 1967. 

[22] Aaron Sloman. Motives, Mechanisms and Emotions. In M. A. Boden (ed.), The Philosophy 
of Artificial Intelligence, Oxford Readings in Philosophy Series, Oxford University Press, 
1990. 

[23] Aaron Sloman. What Sort of Architecture is Required for a Human-like Agent? Invited 
talk at Cognitive Modeling Workshop, AAAI96, Portland, Oregon, August, 1996. 

[24] Milind Tambe. Teamwork in Real-world, Dynamic Environments. International 
Conference on Multi-Agent Systems, 1996. 

[25] R. Tuomela. The Importance of Us. Stanford University Press, 1995. 

[26] Bonnie Webber, N. Badler, B. Eugenio, C. Geib, L. Levison, M. Moore. Instructions, 
Intentions and Expectations, University of Pennsylvania, Computer and Information 
Department, June, 1993. 

[27] Eric Werner. A Unified View of Information, Intention and Ability. In Decentralized 
Artificial Intelligence, Y. Demazeau and J. P. Mueller (eds.), Elsevier Science Pub., 1991. 



Acknowledgment 

PROJECTO PRAXIS XXI SARA 2/2.1/TIT/1662/95 




Genetic Integration 
in a Multiagent System 
for Job-Shop Scheduling 



Thierry Galinho, Alain Cardon, and Jean-Philippe Vacher 

1 PSI-LIRINSA 
Insa de Rouen 

Place Emile Blondel, F-76130 Mont-Saint-Aignan 
{Thierry.Galinho, Jean-Philippe.Vacher}@insa-rouen.fr 
2 LIP6 Paris VI 
UPMC, Case 69 

4, Place Jussieu, F-75252 Paris Cedex 05 

Alain.Cardon@lip6.fr 



Abstract. The Job-Shop scheduling problem is a typical NP-hard problem. 
Determining an optimal solution is almost impossible, but trying to improve 
an existing solution is a way to reach a better distribution of tasks. We use 
Multi-Agent Systems (MAS). These simulate the behavior of entities that 
collaborate to perform actions on a Gantt diagram with the intention of better 
optimizing a given economic function. Communications between the global and 
local agents that make up the MAS, arising from their actions, drive the 
appearance of agents of intermediate granularity and the global optimization 
of the production schedule. To obtain micro- and meta-agents, a Multi-Objective 
Genetic Algorithm is used, with reference to an ideal solution of such a 
problem, that is, a point where each objective function takes its best (minimum) 
possible value. The genetic autonomy and the notion of motivation for an agent 
may lead to a drastically new kind of emergence phenomenon (different social 
behavior, auto-referring evaluation process, ...) in self-organizing multiagent 
systems. It is certainly a difficult task, but it may sow the seeds of a prolific 
approach, inspired by artificial life, to optimizing a Job-Shop Scheduling 
Problem. 



1 Introduction 

The Job-Shop scheduling problem is a typical NP-hard problem. It is 
therefore impossible to determine an optimal solution (with respect to a 
given criterion) in a reasonable time. This invites the use of random 
techniques and/or heuristics to find a good-quality solution, which may 
happen to be an optimal one. Nevertheless, given the large number 
of possible solutions, we are far more likely to find a local 
optimum. A solution is represented graphically in the form of a Gantt 
diagram. 

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 76-87, 1998.

The problem is to improve this representation so as to find a better
solution, and therefore a new one corresponding to a new optimum, while
satisfying the constraints that we have fixed. To do so, we employ a
multi-agent system (MAS) whose agents, according to the knowledge they have
been able to acquire [11], make the Gantt diagram evolve towards a good
solution. In this paper we first present multi-agent systems in production
scheduling. We then specify the model we have used and, finally, the notion
of agent granularity, which arises from the agents' knowledge and from
reproduction between agents.

2 MAS in Scheduling 

2.1 Representation of the Gantt Diagram 

The representation level we have chosen is the operation. Since a job
consists of several operations, the maximum delay over all jobs must be
minimized [15] [13]. Whether a solution is good or bad, its Gantt diagram
exhibits certain characteristics, such as holes, operations of a similar
nature scattered over the diagram, etc. [10]. The diagram is the graphic
interpretation of the scheduling of all jobs (see Fig. 1). The objective is
to find a better solution by minimizing the holes (idle times) and by
regrouping operations of a similar nature so as to reduce cleaning and
pre-series times, taking the cost matrix into account. Generally speaking,
the Gantt diagram must be improved horizontally. But the horizontal view
alone is not sufficient, even when the constraints are respected. It is
important to improve the Gantt diagram according to zones that can be
identified by their similar characteristics. Zones with special
characteristics (zones of inactivity, ...) thus appear, and agents must act
on these zones so as to improve the Gantt diagram.
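To make the notion of holes concrete, the following minimal sketch measures the idle time on each machine of a Gantt diagram. The representation of an operation as a `(machine, start, end)` triple, the function name `idle_time_per_machine` and the assumption that every machine is available from time 0 are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def idle_time_per_machine(operations):
    """Return {machine: total length of the holes} for a list of
    (machine, start, end) operations. Machines are assumed (hypothetically)
    to be available from time 0."""
    by_machine = defaultdict(list)
    for machine, start, end in operations:
        by_machine[machine].append((start, end))

    holes = {}
    for machine, slots in by_machine.items():
        slots.sort()
        idle = 0
        cursor = 0  # end of the last busy period seen so far
        for start, end in slots:
            if start > cursor:
                idle += start - cursor  # a hole between two operations
            cursor = max(cursor, end)
        holes[machine] = idle
    return holes


# Illustrative schedule: M1 is idle between 3 and 5, M2 between 0 and 2.
schedule = [("M1", 0, 3), ("M1", 5, 7), ("M2", 2, 4), ("M2", 4, 6)]
print(idle_time_per_machine(schedule))
```

A quantity of this kind could serve as one of the criteria that agents try to reduce when they act on a zone of inactivity.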

2.2 Objects Manipulated 

As stated above, the representation level we have chosen is the operation,
so the objects we manipulate are operations. By extension, agents act on
groups of operations (or zones), a notion introduced in the previous
section. A zone does not necessarily refer to physical operations; it can
be potential, such as a hole, which corresponds to a potential task of
inactivity.

Consequently, the goal of the agents composing the MAS is to group
operations together while respecting the precedence constraints at the
level of the Gantt diagram.

3 Objective of the MAS on the Gantt Diagram 

By definition, MAS are an emerging subfield of Artificial Intelligence that
emphasizes the following two principles:

1. The construction of complex systems employing multiple agents,

2. Mechanisms for the coordination of independent agent behaviors.

Fig. 1. A Gantt diagram representation

This definition is, however, not generally accepted in AI. For the purposes
of this article, we consider an agent to be an entity with objectives,
actions to accomplish and areas of knowledge, which is situated in its
environment.

However, the ability to reason about the coordination of autonomous agent
behaviors is a new direction within Distributed Artificial Intelligence
(DAI). Given the agents' knowledge, action rules, ..., the principal
objective of the MAS is to group agents with similar behaviors in order to
elaborate strategies at the level of jobs, jobs of jobs, machines, machines
of machines, etc. The notion of group thus appears. Since the objective of
the MAS is to improve the Gantt diagram, a group is defined as a set of
elementary entities that share common traits and physical similarities
(same machine capacity, etc.) or that are interdependent.

We use the term zone for a gathering of entities on the Gantt diagram, and
the term group for a gathering of entities of similar or closely related
nature.

Since agents have to act both on groups and on elementary entities, the MAS
is composed of micro- and meta-agents. For this organization to evolve, it
is important to introduce agents with a particular character: the
meta-agents of evolution. Their function is to make the organization evolve
by means of a genetic algorithm that establishes sexual reproduction
between agents. Note that, traditionally, the only way for agents to
reproduce is cloning; here, we use a genetic algorithm for the physical
evolution of agents. In the course of this evolution, agents of different
sizes appear: we speak of agent granularity. Micro- and meta-agents thus
act, according to their size, on an entity or on a group, passing through
intermediate levels. Agents with meta-knowledge are able to act on
macro-entities (groups) as well as on certain zones of the Gantt diagram.
The result is a distributed system of agents that can mutate and cross with
one another.

4 The Multidimensional Transformation of a Gantt Diagram

A schedule represented as a two-dimensional Gantt diagram does not make it
possible to define its global or general characteristics, by which we mean
the notion of quality, compliance with the master plan, etc. An operation,
as a constituent of a job, reveals only local characteristics (delay,
advance, ...). These local characteristics do not reveal the information,
nevertheless essential, that is needed to obtain global evaluations from
the (likewise global) forecasts of the sales department. The grouping of
operations by family, taking into account a macro-nomenclature to which
correspond the macro-programs for carrying out the tasks of a family, is
not visible either. Generally speaking, given the NP-hard nature of such
organizations, we aim to improve a Gantt diagram according to an economic
function. To do so, we need to transform the evaluated Gantt diagram into a
dynamic system with:

1. views at the local level, corresponding to a structure such as the
share, the operation, ... and

2. views at the global level, which also correspond to a structure (Gantt
diagram, etc.).

These views interact with one another through the dynamic function, which
corresponds to the evolution of an organization. The idea is to arrive at
an organization of the manipulated elements. This organization is under
tension, that is to say that some elements "react" by making evident the
fact that they do not contribute to an improvement of the Gantt diagram.
The goal of our system is to "slacken" it, so as to reach a global
improvement of the Gantt diagram without arriving at a "rupture". By
rupture, we mean that the Gantt diagram no longer satisfies production
needs at all, nor takes the master plans of the enterprise into account. To
achieve this, we carry out a colored transformation of the Gantt diagram to
obtain a multidimensional representation of it. As a formalism we have
chosen the decomposition of the spectrum of white light to represent the
tuple (Advance, Delay, Priority). Thus, by playing on shades, we can
represent all possible cases of the tuple (see Fig. 2). However, some
characteristics are not visible but must nevertheless appear in the coding
(product family, etc.) so that cross-checking can determine the strategies
with which the multi-agent system, composed of global agents (of trends)
and local agents (operating on one task or a group of tasks), can globally
improve the Gantt diagram while respecting the population. The radiation of
an operation thus corresponds to its ability to show information clearly:
activity to carry out, cycle time, machine to use, etc.
A 2-D Gantt diagram is a discrete environment because of the presence of
holes; consequently, it cannot be considered a continuous space.
Topologies, as generally defined, refer to a continuous environment. Our
objective is to show that the connectivities ordinarily defined in a
discrete space are equivalent to those of a continuous space. This will
allow us to define continuous sets from discrete sets. Generally speaking,
this amounts to performing a decomposition of [...].

Once the Gantt diagram has been transformed into discrete multidimensional
images, identical information can be grouped together. By transforming our
Gantt diagram into a multidimensional image, we obtain a more global view
of the system on which the agents will be able to act.
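As an illustration of the colored transformation, the sketch below maps an (advance, delay, priority) tuple to a color. Assigning one RGB channel to each component, the function name `to_rgb` and the normalization by a known maximum are assumptions made for this example only; the paper states only that the decomposition of the spectrum of white light is used as a formalism.

```python
def to_rgb(advance, delay, priority, max_advance, max_delay, max_priority):
    """Map an (advance, delay, priority) tuple to an 8-bit RGB colour.

    Hypothetical encoding: red = advance, green = delay, blue = priority,
    each scaled by the largest value expected for that component.
    """
    def channel(value, maximum):
        if maximum == 0:
            return 0
        ratio = min(max(value / maximum, 0.0), 1.0)  # clamp to [0, 1]
        return round(255 * ratio)

    return (channel(advance, max_advance),
            channel(delay, max_delay),
            channel(priority, max_priority))


# An operation that is as late as possible and has the lowest priority.
print(to_rgb(0, 8, 0, max_advance=8, max_delay=8, max_priority=2))
```

With such an encoding, operations sharing the same shade can be gathered into a zone, which is the kind of regrouping of identical information described above.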




Fig. 2. Multidimension representation of a Gantt diagram 



4.1 Mutation and Crossing Functions 

Traditionally, the agents of a MAS can reproduce only by cloning. Here, we
introduce the notion of sexual reproduction: it is entirely possible to
cross several agents so as to produce new ones. To do this, we use a
genetic algorithm.

Mutation corresponds to flipping a bit, as in (1) [12], so standard
operators can be used.

\[ 1 \rightarrow 0 \quad \text{and} \quad 0 \rightarrow 1 \tag{1} \]

Our constraint, at the mutation level, is to maintain a correspondence
between the bit string and the database. By changing the value of one bit,
we can introduce a new trait into an agent. This has repercussions on the
environment, but above all on the agent's membership of a group: the
communications it has had with other members of the group will
unquestionably change. For example, if the mutation introduces a certain
aggressiveness in the agent, its communications with the group will change
and the group will consequently probably lose some of its social cohesion.
Therefore, to avoid too abrupt an upset of the social balance that may
exist between the individuals composing a group and the organization
itself, the mutations introduced by the genetic algorithm must remain
limited. Nevertheless, we can consider that at the beginning of the
simulation of the organization, as at the beginning of a civilization,
progress is fairly rapid, so a large number of mutations can be introduced
at that stage. For the number of mutations per generation we use the
distribution (2), a curve with parameters (\alpha, \beta) [1].

\[ f : x \mapsto \ldots \quad \text{with } \beta \in \mathbb{R}^{+} \text{ and } \alpha \in \mathbb{R}^{+*} \tag{2} \]

Thus, by using this type of distribution, we introduce many mutations at
the beginning of the simulation and few at the end, so as not to break the
evolution process by deeply modifying the characteristics of chromosomes,
and therefore of individuals.
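The idea of many mutations early and few late can be sketched as follows. The exact curve used by the authors is not recoverable from the text, so the exponential decay `alpha * exp(-beta * generation)` below is purely an assumed stand-in, as are the names `mutations_for_generation` and `flip_bits` and the default parameter values.

```python
import math
import random


def mutations_for_generation(generation, alpha=8.0, beta=0.1):
    """Hypothetical decreasing schedule for the number of mutations.

    alpha is the initial number of mutations, beta the decay rate; both
    values are illustrative.
    """
    return round(alpha * math.exp(-beta * generation))


def flip_bits(chromosome, n_flips, rng):
    """Return a copy of a bit list with n_flips distinct bits inverted
    (1 -> 0 and 0 -> 1)."""
    mutated = list(chromosome)
    positions = rng.sample(range(len(mutated)), min(n_flips, len(mutated)))
    for i in positions:
        mutated[i] = 1 - mutated[i]
    return mutated


rng = random.Random(0)   # fixed seed so that runs are repeatable
agent = [0] * 16         # illustrative 16-bit chromosome
for generation in (0, 10, 40):
    n = mutations_for_generation(generation)
    print(generation, n, flip_bits(agent, n, rng))
```

Under these assumed parameters the schedule yields 8 mutations at generation 0, 3 at generation 10 and none at generation 40, which matches the qualitative behavior described in the text.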

Too many mutations in the system would inexorably sow the seeds of chaos.
We have seen one possible distribution above. Alternatively, by using a
Gaussian distribution to determine the probability of mutation, we preserve
the standard form of the genetic algorithm.

Also, at the end of the development, mutation will not be governed by a
mathematical function; rather, a meta-agent can be used to stop the process
if the system does not give good results.

5 Description of the Genetic Algorithm Used 

A MAS simulates the behavior of the entities composing a social
organization. The GA, for its part, simulates the biological evolution
process, in Darwin's sense, of the entities composing an organization. That
is to say, less effective individuals disappear to the benefit of
better-adapted ones. An elitist system, in the style of the "law of the
strongest", is thereby established.

The genetic algorithm used follows the broad lines described in the works
of D. Goldberg [7] and J. Holland [8,9]. However, a Simple Genetic
Algorithm (SGA) involves only one performance function. Here, we use the
Pareto optimality criterion to establish our fitness function, so we have a
multicriterion optimization function [4]. The result is a multi-objective
genetic algorithm [6].

The ideal solution of such a problem is a point where each objective
function takes its best (minimum) possible value. In most cases the ideal
solution does not exist because the objective functions are contradictory:
compromises have to be made. A different concept of optimality therefore
has to be introduced. Solving a multiobjective problem generally requires
the identification of Pareto optimal solutions [14], a concept introduced
by V. Pareto, a prominent Italian economist, at the end of the nineteenth
century. A solution is said to be Pareto optimal, or non-dominated, if,
starting from that point in the design space, the value of any of the
objective functions cannot be improved without deteriorating at least one
of the others.
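The definition of a non-dominated solution can be written down directly. In this minimal sketch every objective is minimized, a candidate is a tuple of objective values, and the names `dominates` and `pareto_front` as well as the sample values are illustrative.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives are
    minimized): a is no worse everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated points, preserving the input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative (maximum delay, idle time) pairs for five schedules.
candidates = [(10, 4), (8, 6), (9, 5), (12, 3), (9, 7)]
print(pareto_front(candidates))
```

Here the schedule (9, 7) is dominated by (8, 6), which is better on both objectives; the other four are mutually incomparable and together form the set of compromises.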



5.1 Direct Solution of the Multiobjective Problem

Directly solving the multiobjective problem has the advantage of finding a
representative subset of the Pareto front in one shot; on the other hand,
not many efficient methods exist that are capable of this approach. A
genetic algorithm using the dominance criterion to drive the evolution of
the population is one of these methods. The characterizing feature of the
multi-objective GA is thus the introduction of the Pareto criterion in the
method used for selecting individuals [14]; by selecting individuals in the
reproduction phase according to the domination criterion, a set of
non-dominated solutions can be developed. These are all possible
alternative solutions to the problem, which meet the requirements at
different levels of compromise and which approximate the Pareto front of
the problem. In this way, the arbitrary choice of the weights to attribute
to the different design criteria is avoided.
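One common way to use the domination criterion during selection is to rank individuals by successive non-dominated fronts and to prefer lower ranks. The sketch below shows such a ranking; it is not claimed to be the selection scheme used by the authors, and the names `pareto_ranks` and `population` are illustrative.

```python
def dominates(a, b):
    """True if a Pareto-dominates b, all objectives being minimized."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_ranks(population):
    """Rank 1 is given to the non-dominated individuals; rank k to those
    that become non-dominated once all ranks below k are removed."""
    ranks = {}
    remaining = list(range(len(population)))
    rank = 1
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(population[j], population[i])
                            for j in remaining)]
        for i in front:
            ranks[i] = rank
        remaining = [i for i in remaining if i not in front]
        rank += 1
    return [ranks[i] for i in range(len(population))]


population = [(10, 4), (8, 6), (9, 5), (12, 3), (9, 7), (13, 8)]
print(pareto_ranks(population))
```

Because the rank depends only on dominance relations, no weights have to be chosen to combine the objectives, which is the advantage noted above.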



6 Going Deeper into the Relationship between GA and MAS

The use of GA in MAS is the beginning of what may become an interesting
research area. There are clearly two kinds of approach. The first is
centralized; in other words, some of the genetic machinery lies outside the
agent. The selection function is a good example of such an out-of-the-agent
feature [18] [16] [17].

However, we believe that to merge the genetic approach and MAS completely
(see Fig. 3), the agent must be made a completely autonomous genetic
entity. By that we mean that not only the genetic heritage has to be "on
board" but also the functions of selection and crossing: an agent must
choose which other agent it wants to reproduce with [17]. Where the
mutation function should be located is less clear, since mutation is caused
both by possible exposure to external events coming from the environment
and by the replication phase of the genetic code.

If we also introduce the notion of motivated behavior for agents [2], we
move deep into the problems of artificial life. Genetic autonomy and the
notion of motivation for an agent may lead to a drastically new kind of
emergence phenomenon (different social behavior, self-referring evaluation
process, ...) in self-organizing multi-agent systems. This is certainly a
difficult task, but it may sow the seeds of a prolific approach to
artificial life.

7 Agent Modeling

The goal is to improve a Gantt diagram corresponding to a job-shop
scheduling problem. To do this, we use a simplified version of a real
workshop description that was defined in the system named FISIAS (Fast
Interactive System for Intelligent Automated Scheduling). This development,
made for an industrial setting, was obviously not suited to our case as it
stood, so we chose to modify or remove some elements of the model.

Fig. 3. MAS and GA in the environment

Generally speaking, the set of processing steps relating to scheduling in
quantity is not used here. Likewise, we work with finite loading, keeping
the information that comes from infinite loading as indicators, or gauges,
for our system: our development uses finite loading together with some
results coming from MRP (infinite loading). At the workshop level, we use
linear time rather than calendar time.



7.1 Possible Roundups 

Within the workshop, some machines are equivalent. To arrive at an agent
representation, we must form groupings of entities. Each of these entities
will correspond to an "agent" in the later development of our problem. An
entity can be an equivalent machine, a job, an operation, ...

Thus, starting from an independent set of equivalent machines, we form
groupings according to given criteria to obtain groups of machines. These
are then divided into sub-groups according to output, to the cleaning-time
matrix, to loading/bottleneck, ... We proceed in the same way for jobs,
operations, ... In our model, a job is characterised by its linear routing,
by whether it can be interrupted or divided, and by dates (due date,
availability, criticality, jamming, ...). An operation is characterised by
its processing time, ...

As we have groups of machines, we also have groups of jobs, operations,
..., and each group has a set of sub-groups: sub-groups by output, by
cleaning/pre-series matrix, etc. How, then, do we obtain coherent
sub-groups?

To obtain coherent sub-groups, we use a refinement technique: we start from
the largest group and refine its sub-groups to specialize them. At the end
of the process, we obtain a static representation.
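The refinement step can be pictured as a successive partition of the largest group, one criterion at a time. The sketch below is only an illustration of that idea: the machine attributes (`output`, `bottleneck`), the function name `refine` and the order of the criteria are assumptions, not details given in the paper.

```python
from itertools import groupby


def refine(group, criteria):
    """Split a group into nested sub-groups, one nesting level per criterion.

    criteria is a list of key functions; the result is a nested dict whose
    leaves are lists of entities.
    """
    if not criteria:
        return list(group)
    key, *rest = criteria
    ordered = sorted(group, key=key)  # groupby needs sorted input
    return {value: refine(list(members), rest)
            for value, members in groupby(ordered, key=key)}


# Illustrative set of equivalent machines.
machines = [
    {"name": "M1", "output": 100, "bottleneck": False},
    {"name": "M2", "output": 100, "bottleneck": True},
    {"name": "M3", "output": 50, "bottleneck": False},
    {"name": "M4", "output": 100, "bottleneck": False},
]
tree = refine(machines, [lambda m: m["output"], lambda m: m["bottleneck"]])
for output, by_load in tree.items():
    for bottleneck, members in by_load.items():
        print(output, bottleneck, [m["name"] for m in members])
```

The same function could be applied to jobs or operations with other criteria, giving the static hierarchy of groups and sub-groups on which agents are later defined.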

For the dynamic representation, which corresponds to the quality aspect, we
use information coming from the schedule with infinite loading as well as
information coming from quality criteria. We thus obtain groups by using
resources from the workshop and from the material.



7.2 Scheduling System by Agents 



As we have groups of machines, we also have groups of jobs, operations,
etc. Some agents have to carry out actions on a part of the Gantt diagram
or on a group. Our agents must evolve in order to perform relevant actions
and to manage, modify or create specific groups. Our system does not
perform the scheduling directly but relies on its own dynamics: we do not
place the jobs and operations to be carried out on the machines ourselves.
It is the agents that place operations on the Gantt diagram, according to
timeliness and to the agents' capabilities.

As Fig. 4 shows, the scheduling is carried out by agents. Since the
schedule evolves during the process, the agents must evolve during the
scheduling process as well. This evolution is achieved through genetic
feedback.




Fig. 4. Dynamic representation by agents for our problem

8 Agent Representation

8.1 View of an Agent 

Starting from the representation and the definition of an agent given by
J. Ferber, we can adopt a simplified view of it.

An agent can be represented as follows:

- an action function, leading to groups of agents,

- heuristics and strategies,

- homeostatic parameters, leading to rules of behaviour,

- a communication network.

An agent thus simulates the behaviour of a living organism. We can say that
an agent has three states: rest, the intention to do something, and action
(see Fig. 5).
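The four components and the three states listed above can be gathered in a small data structure. This is a sketch only: the field names, the `State` enumeration and in particular the cyclic order rest, intention, action are assumptions, since the text names the states without specifying the transitions between them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class State(Enum):
    REST = "rest"
    INTENTION = "intention"
    ACTION = "action"


# Hypothetical cycle; the source only names the three states.
NEXT_STATE = {
    State.REST: State.INTENTION,
    State.INTENTION: State.ACTION,
    State.ACTION: State.REST,
}


@dataclass
class Agent:
    name: str
    act: Callable[[object], None]                                 # action function
    strategies: List[str] = field(default_factory=list)           # heuristics
    homeostasis: Dict[str, float] = field(default_factory=dict)   # behaviour rules
    neighbours: List[str] = field(default_factory=list)           # communication network
    state: State = State.REST

    def step(self, target):
        """Move to the next state and act on the target when reaching ACTION."""
        self.state = NEXT_STATE[self.state]
        if self.state is State.ACTION:
            self.act(target)


log = []
agent = Agent("a1", act=log.append, neighbours=["a2"])
for _ in range(3):
    agent.step("operation-7")
    print(agent.state, log)
```

In a fuller model the homeostatic parameters would presumably decide when an agent leaves the rest state, rather than the fixed cycle assumed here.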




Fig. 5. The three states of an agent







Fig. 6. Data representation between MAS and Gantt diagram 



9 Conclusion 

We have seen in this paper that a multi-agent system can be used with a
genetic algorithm to minimize the delay in a job-shop scheduling problem.
Determining an optimal solution is almost impossible, but improving an
existing solution is a way to reach a better distribution of tasks. We have
therefore used multi-agent systems.

Whether a solution is good or bad, its Gantt diagram exhibits certain
characteristics, such as holes, operations of a similar nature scattered
over the diagram, etc. The MAS simulates the behavior of entities that
collaborate to carry out actions on the Gantt diagram with a view to better
optimizing the given economic function. Since we do not have a single
economic function, we use a multi-objective genetic algorithm to search for
an ideal solution, that is, a point where each objective function takes its
best (minimum) possible value.

During the simulation process, agent granularity appears through the
mutation behavior introduced by the GA. At the end of the simulation,
communications between global and local agents, arising from their actions,
govern the appearance of agents of intermediate granularity and the global
optimization of the production schedule. This paper reflects the
integration of genetics into a multi-agent system; the first results show
that a MAS combined with a multi-objective GA is a way to optimize the
Gantt diagram of a job-shop scheduling problem.



Abstract. Many systems concerning agent teams in a world with obstacles
have already been developed. One of the problems of such systems lies in
how to maintain a pre-defined formation when several agents are moving in
the world. In this paper we argue that, in order to have a robust and
realistic system, a control model that includes the notions of mass and
acceleration must be used. To show this, we developed a control system
based on classical mechanics, that is, a force-based model. The results
obtained show that, although some problems arise when using such a
realistic kind of model, they are solvable, and the quality of the
simulations performed by the system is significantly better than that of
the simulations obtained using other control models.

Keywords: Multi-Agent Systems, Multi-Agent Formation



1 Introduction 

Among the known problems in the field of mobile robotics, the most commonly
discussed is how to allow a robot to move autonomously in an unknown
environment. This problem becomes harder when we consider not a single
robot but a team of them, since each robot must take into account not only
the terrain topology but also the positions and movements of the remaining
robots, to prevent collisions and to stay out of their way. To handle such
a problem we must therefore establish a group behavior that allows the
robots to perform correctly.

Recently, these problems have been studied in the area of software agents
[4] and multi-agent systems. One of the distinguishing features of this
area is that each agent is responsible for determining its own actions
(resorting only to its knowledge of the world, given by its sensors) and is
not controlled by any external process. This tries to mimic what we find in
nature, where no higher intelligence determines the individual actions.
Rather, the overall behavior of the society of agents emerges from the
individual decisions. The software-agent approach thus seems not only more
adequate but also more correct for dealing with this problem.

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 88-100, 1998. 

© Springer-Verlag Berlin Heidelberg 1998

In this area, several systems have already been developed. The most famous
of these is undoubtedly the BOIDS system, created by Craig Reynolds [3].
Based on the same approach, other systems have been developed, such as the
system by Hodgins and Brogan [2], which simulates a herd of pogo-stick-like
robots in a three-dimensional world. One aspect of these systems is that
they do not maintain any kind of formation, nor do they differentiate the
agents and specify their desired positions in relation to each other.
However, formations are important since they allow the team to use its
sensory assets more efficiently than if the team were arranged randomly
[1]. This paper focuses precisely on the formation control of a team of
robots.

Moreover, we want the formation to remain robust in the presence of
unpredicted obstacles. One system that achieves these goals was created by
Balch and Arkin [1]. It is based on a small number of robots able to
maintain a pre-determined formation while moving towards a goal, even in
the presence of obstacles. However, their approach has some limitations, in
particular its lack of realism.

In this paper we argue that this kind of system should have a more
realistic dynamic model, one that includes the notions of mass and
acceleration. The work described here introduces these notions through a
formation control system based on classical mechanics, that is, a
force-based model. We show the problems that arise when using such a
realistic model, and how we solved them.

This paper is organised as follows. In the next section, we discuss the
problem domain and the entities involved in it. Next, we briefly describe
the main characteristics of Balch and Arkin's system [1], and explain why
some improvements are needed. Then, we show what makes our system different
from the existing ones and the results we achieved with it. Finally, we
discuss the results and point out the conclusions our work led to.

2 The Problem Domain 

The system created simulates a society where several agents can move in a world 
with obstacles, whilst trying to reach a goal position. Since this system can be 
seen as an extension of the original work by Balch and Arkin [1] we introduce 
the basic concepts involved very briefly. 



2.1 The Formations 

In order to define a formation, it is necessary to distinguish an agent
from its companions. Thus, we assign a unique number to each agent. A very
large number of formations are, of course, possible. There are, however,
four standard formations in military domains, depicted in Fig. 1:

Line: the robots travel side by side.

Column: the robots travel one behind the other.

Diamond: the robots are positioned at the vertices of a diamond.

Wedge: the robots are positioned in a "V" shape.

Fig. 1. Possible formations

These formations, besides allowing the validity of the system to be
verified in that domain, are rich enough to test a large number of
situations in simulation, each one having its own problems, as we will see
below.

2.2 Reference Methods 

There are three classical reference methods: in relation to a given robot,
in relation to a leader, and in relation to the formation's center of mass.
The first method is similar to the one used in BOIDS [3], with the
difference that in BOIDS the position of a bird depends on those of its
neighbors, which can be any other birds. Here, the robot from which the
position is determined is given from the start. The desired position of a
robot in relation to its reference point is defined by two values: an angle
and a distance.

The angle must be measured not in relation to the horizontal, but in
relation to the perpendicular to the movement of the reference point. This
allows the formation to maintain itself when it is not moving straight
ahead.

When the reference point is a single robot, its velocity vector gives the
direction of movement of the reference point. When the reference point is
the center of mass of the formation, the average of the individual velocity
vectors is used.
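The angle-and-distance definition can be sketched as follows. Two conventions are assumed here because the text does not fix them: the perpendicular is taken to point to the right of the direction of travel, and the angle is counted counter-clockwise from it. The function names are likewise illustrative.

```python
import math


def desired_position(ref_pos, ref_velocity, angle, distance):
    """Desired position of a robot relative to a moving reference point.

    The angle is measured from the perpendicular to the reference point's
    direction of travel (assumed here to point to the right of travel).
    """
    heading = math.atan2(ref_velocity[1], ref_velocity[0])
    perpendicular = heading - math.pi / 2
    direction = perpendicular + angle
    return (ref_pos[0] + distance * math.cos(direction),
            ref_pos[1] + distance * math.sin(direction))


def mean_velocity(velocities):
    """Average velocity vector, used when the reference point is the
    center of mass of the formation."""
    n = len(velocities)
    return (sum(v[0] for v in velocities) / n,
            sum(v[1] for v in velocities) / n)


# Reference robot at the origin, travelling along +y: a slot at angle 0 and
# distance 2 lies to its right, on the +x axis.
print(desired_position((0.0, 0.0), (0.0, 1.0), angle=0.0, distance=2.0))
print(mean_velocity([(1.0, 0.0), (0.0, 1.0)]))
```

Because the slot is expressed relative to the heading rather than to the horizontal, it rotates with the reference point when the formation turns, which is the property the text asks for.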

2.3 The World 

The world is a bi-dimensional square with arbitrary size, where several obstacles 
(columns with a given radius) can be found. 




Fig. 2. Relative reference

Table 1. Motor Schemas

Motor schemas           Usefulness
Avoid-Static-Obstacle   Avoid collisions with static obstacles
Avoid-Robot             Avoid collisions with other robots
Move-To-Goal            Drive the robots towards the goals
Noise                   Noise
Maintain-Formation      Force the robots to maintain the formation

Apart from the obstacles, the only other important points in the world are 
the goals. A goal is a point that represents the place the robots must go in order 
to fulfil their mission. In the case there are several goals in the world, the robots 
must get to them in a pre-specified order. The robots are considered to have 
achieved a goal when the center of mass of the formation is less than a given 
number of units from the goal. 
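The goal test described above can be sketched in a few lines. Equal robot masses are assumed, so that the center of mass is the plain average of the positions; the names `center_of_mass`, `next_goal_index` and `tolerance` are illustrative.

```python
import math


def center_of_mass(positions):
    """Center of mass of the formation, assuming robots of equal mass."""
    n = len(positions)
    return (sum(p[0] for p in positions) / n,
            sum(p[1] for p in positions) / n)


def next_goal_index(positions, goals, current, tolerance):
    """Return the index of the goal to pursue next: the following goal in the
    pre-specified order once the formation's center is within `tolerance`
    units of the current one, otherwise the current goal."""
    cx, cy = center_of_mass(positions)
    gx, gy = goals[current]
    if math.hypot(gx - cx, gy - cy) < tolerance:
        return current + 1
    return current


robots = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
goals = [(1.0, 1.5), (10.0, 10.0)]
print(next_goal_index(robots, goals, current=0, tolerance=1.0))
```

An index equal to `len(goals)` would mean that every goal has been reached.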



2.4 The Motor-Schemas

The formation behaviors of the robots were implemented as motor-schemas
(Table 1). These schemas are similar to the ones used by Balch and Arkin.

Each schema yields a vector (usually a force) whose magnitude ranges from
zero to a preset maximum. All the resulting vectors are added, taking into
account their relative gains. The resulting vector determines, at each
instant, how the robot's movement will be altered.
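The combination of schema outputs is a gain-weighted vector sum. In the sketch below the schema names are those of Table 1, while the vector values and gains are made-up numbers for illustration.

```python
def combine(schema_vectors, gains):
    """Sum the 2-D output vectors of all schemas, each scaled by its gain."""
    x = sum(gains[name] * v[0] for name, v in schema_vectors.items())
    y = sum(gains[name] * v[1] for name, v in schema_vectors.items())
    return (x, y)


# Illustrative outputs of three schemas at one instant.
vectors = {
    "Move-To-Goal": (1.0, 0.0),
    "Avoid-Static-Obstacle": (0.0, 0.5),
    "Maintain-Formation": (-0.2, 0.0),
}
gains = {
    "Move-To-Goal": 1.0,
    "Avoid-Static-Obstacle": 2.0,
    "Maintain-Formation": 0.5,
}
print(combine(vectors, gains))
```

Raising the gain of one schema makes the corresponding behavior weigh more in the resulting movement, without changing the schemas themselves.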

Avoid-Static-Obstacle and Avoid-Robot: these schemas are used to prevent
collisions with an obstacle or with other robots. The resulting vector lies
along the line that joins the robot and the obstacle, in the direction that
keeps them apart. The intensity of the vector depends on two values: a
minimum and a maximum range. If the distance between the robot and the
obstacle is greater than the maximum range, the intensity of the vector is
zero. If it is smaller than the minimum range, it is the maximum permitted
intensity. Otherwise, it varies between those two values.

Move-To-Goal: this schema is responsible for making a robot move in the
direction of its objective. Its result is simply a vector with the maximum
intensity and the direction of the goal.

Noise: to make the simulation more realistic, this schema results in a
vector with random direction and intensity, thus introducing noise into the
system.

Maintain-Formation: this is, perhaps, the most important schema, since it
allows the robots to position themselves in the desired formation
positions. First, the desired position is determined in relation to the
established reference point. Then, the intensity of the vector pointing
towards that position varies according to the distance of the robot to it.
As in the avoidance schemas, two values are considered: a minimum and a
maximum radius. In this case, however, if the distance to the desired point
is greater than the maximum radius, the intensity of the vector is the
greatest allowed. The desired point is then said to be in the "ballistic
zone". If the distance is smaller than the minimum radius, the intensity of
the vector is zero ("dead zone"). Otherwise, the desired point is in the
"controlled zone" and the vector's intensity is proportional to the
distance.
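The two intensity profiles described for the avoidance and Maintain-Formation schemas can be sketched as follows. Two assumptions are made: between the minimum and maximum range the avoidance intensity is interpolated linearly, and in the controlled zone the formation intensity is taken literally as proportional to the distance, scaled so that it reaches the maximum at the outer radius. Function and parameter names are illustrative.

```python
def avoid_magnitude(distance, min_range, max_range, max_intensity=1.0):
    """Intensity of an avoidance vector as a function of obstacle distance."""
    if distance >= max_range:
        return 0.0                      # obstacle too far to matter
    if distance <= min_range:
        return max_intensity            # obstacle dangerously close
    # Assumed linear decrease between the two ranges.
    return max_intensity * (max_range - distance) / (max_range - min_range)


def maintain_formation_magnitude(distance, min_radius, max_radius,
                                 max_intensity=1.0):
    """Intensity of the vector pulling a robot towards its formation slot."""
    if distance > max_radius:
        return max_intensity            # ballistic zone
    if distance < min_radius:
        return 0.0                      # dead zone
    # Controlled zone: proportional to the distance (assumed scaling).
    return max_intensity * distance / max_radius


for d in (0.5, 2.5, 5.0):
    print(d, avoid_magnitude(d, 1.0, 4.0),
          maintain_formation_magnitude(d, 1.0, 4.0))
```

Note the opposite shapes: the avoidance intensity grows as the obstacle gets closer, while the formation intensity grows as the robot gets farther from its slot.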



2.5 Limitations of the Existing Approach 

The mechanisms described so far were used in the system developed by Balch
and Arkin [1]. However, their work has some limitations. Firstly, the
reference method that gives the desired position of a robot in relation to
that of another robot was not implemented. Secondly, and by far the most
important omission, the simulation model is based only on velocity, not on
acceleration. This means that the vectors returned by the motor-schemas
are, in fact, an indication of the desired velocity at the next instant.
This makes the simulation very unrealistic, since it assumes that infinite
accelerations can exist, radically changing the velocity vector of a body
instantly. It also greatly simplifies the control problem, making the task
of maintaining the formation a very easy one.

These limitations motivated the creation of our system, which will be de- 
scribed and evaluated in the following sections. 

3 A Realistic Simulation Model 

To overcome the limitations presented, a new system was created. Mainly, we 
have improved the existing system on three factors: new types of obstacles, new 
vector intensity decay models and a dynamic force-based control model. 

3.1 New Types of Obstacles 

In the real world, it is very unrealistic to assume that all the obstacles an agent 
will face are fixed in nature. Not only that but it is plausible that the agent would 
react differently when faced with different obstacles. It might, for instance, start 
to deviate from larger ones first (since they are circular in nature, they will 
occupy a greater area to avoid, requiring a larger deviation). 

So, in the first place, we assumed our agents have some kind of “improved
sensor” that can detect not only the distance to the nearby obstacles,
but also their radius of curvature. Our avoid-static-obstacle motor-schema
takes this information into consideration, and the resulting vector is
proportional to the size of the obstacles.

Besides this change to the static obstacles, we also introduced another kind
of obstacle: the mobile obstacle. These are other agents that, unlike the
robots, are limited to roaming the landscape in a pre-determined or random
way, thus hampering the voyage of the robots. We created these obstacles in
order to account for the multitude of unexpected features the robots might
encounter in a real situation (people, animals, and so on). To deal with these
obstacles, a different schema, avoid-moving-obstacle, was introduced.

3.2 Vector Intensity Decay Models 

When determining the intensity of the vector returned by a motor-schema,
two radii are normally used. In the case of the avoid-static-obstacle
schema, for instance, when the distance between the obstacle and the robot is
greater than a given radius, the intensity will be zero. If it is smaller than
a certain value, the intensity will be the maximum allowed. Between those two
values, the intensity varies. In the original system, this variation was
linear. In our model, we have implemented a quadratic decay.

We found that this often added an “urgency feeling” to the robots,
as the repulsion (or attraction, depending on the schema) increases greatly
in extreme situations, but is moderate otherwise. Since we used an
acceleration model, if by any chance the robots get into one of those
extremes, the response must be swift. In other cases, we do not want the robot
to use very large accelerations, since that would make the task of controlling
it more difficult.
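
The difference between the two decay models can be illustrated with a short sketch for an avoidance schema. The exact quadratic expression is not given in the text, so the form below (the square of a normalised "closeness") is an assumption, as are the function and parameter names.

```python
def avoidance_intensity(distance, min_range, max_range, max_intensity=1.0,
                        decay="quadratic"):
    """Magnitude of an avoidance vector under a linear or quadratic decay."""
    if distance > max_range:
        return 0.0              # obstacle too far away to matter
    if distance < min_range:
        return max_intensity    # obstacle dangerously close
    # closeness is 0 at the maximum range and 1 at the minimum range
    closeness = (max_range - distance) / (max_range - min_range)
    if decay == "linear":
        return max_intensity * closeness
    if decay == "quadratic":
        # Assumed form: moderate over most of the range, steep near the obstacle.
        return max_intensity * closeness ** 2
    raise ValueError("decay must be 'linear' or 'quadratic'")
```

Halfway between a minimum range of 5 and a maximum range of 50, the linear model yields half of the maximum intensity, while this quadratic form yields a quarter, which matches the moderate response away from extreme situations described above.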

3.3 The Dynamic Model 

In our system, we have implemented a control model where the vectors resulting
from the motor-schemas are, in fact, force vectors. We call this control model
the acceleration model. According to the mass of the robot, we generate an
acceleration vector (up to a given maximum) that we use to alter the
movement of the robot at the next instant of time.

We also introduced attrition into the system. This was done by defining
an attrition constant that generates a force with a direction opposite to that
of movement in each time step. This makes it very difficult for the robots to
saturate the system by getting to the maximum speed. This speed is still
possible, but only temporarily and due to a very large acceleration which, as
we have seen, occurs in extreme cases when using the quadratic decay model.
Thus, a robot falling behind its desired position will, if the situation gets
too bad, give an “extra push” that will be enough to compensate temporarily
for the attrition and get back in formation.

Likewise, even when travelling a long time in a straight line, the speed will
not be at its maximum, thus allowing for a correct change of direction.

All these problems, while relevant when simulating a real environment, do 
not occur in the velocity model (where the output of the schemas is a velocity 
vector, rather than a force). 
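
One time step of such an acceleration model might look like the sketch below. It is not the authors' implementation: the attrition force is assumed to be proportional to the current velocity, the integration is a plain explicit Euler step, and `max_accel` is left as a required argument because its value is not given in the text. The defaults for mass, attrition, maximum speed and sample period are the values quoted elsewhere in the paper.

```python
import math


def clamp(vx, vy, limit):
    """Scale the vector (vx, vy) down so that its norm does not exceed limit."""
    norm = math.hypot(vx, vy)
    if norm > limit:
        return vx * limit / norm, vy * limit / norm
    return vx, vy


def step(position, velocity, force, max_accel, mass=1.0,
         attrition=0.1, max_speed=30.0, dt=0.1):
    """Advance one robot by one sample period under the acceleration model."""
    vx, vy = velocity
    # Schema output is a force: divide by the mass and cap the acceleration.
    ax, ay = clamp(force[0] / mass, force[1] / mass, max_accel)
    # Attrition (assumption): a force proportional and opposite to the velocity.
    ax -= attrition * vx / mass
    ay -= attrition * vy / mass
    # Explicit Euler integration, with the speed capped at its maximum.
    vx, vy = clamp(vx + ax * dt, vy + ay * dt, max_speed)
    x, y = position
    return (x + vx * dt, y + vy * dt), (vx, vy)
```

In the velocity model, by contrast, the schema output would be assigned directly to the velocity, so neither the acceleration cap nor the attrition term would play any role.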

3.4 Problems Introduced by the New Dynamic Model 

The force-based dynamic model brought some new problems to the system. In
the following paragraphs, we briefly discuss them and the ways we found
to solve them, which consisted of carefully tuning a series of parameters that
define how the system will behave.

Global Constants: the value of the maximum acceleration had to be chosen
very carefully. A small value would not be enough to permit the robots
to react quickly enough to a sudden need to change direction (such as
when taking a curve). Likewise, a very large maximum speed would result
in a similar behaviour. Thus, it was necessary to choose these two values
correctly in relation to each other, in order to have accelerations large
enough to account for the speed of the robots, but not so large as to make
them lose control at the slightest change in movement. This was related to the
attrition constant that we chose (that ended up being 0.1).

Maintain-Formation Parameters: although this might not be evident,
making the force returned by this schema large in relation to that of the
others (or to the maximum acceleration) might have ill effects.

For example, if a robot that has left the formation heads with great
speed towards the desired position, once it reaches that position it will
not be able to decelerate fast enough to stay there. So, we will witness an
oscillatory behaviour, where the robot first moves very fast towards the
position in one direction, and then in the other (Fig. 3).

Move-To-Goal Parameters: the move-to-goal schema might not be as easy
to define as it would seem. If the force produced by this schema is very large,
the robots will try to head towards the goal no matter what lies in their path.
This will greatly hamper the robots’ capability to avoid obstacles.

Also, if the acceleration produced by this schema is very high, it will be
difficult for the robots to change direction towards another goal once they
have reached the current one. So, we must be very careful when choosing the
value of the gain for this schema.

Maintain-Formation vs. Move-to-Goal: another important trade-off was 
between the maintain-formation and the move-to-goal schemas. A situa- 
tion that usually arises is that the robots, when travelling in a straight line 
towards a goal, tend to narrow the formation. 

This occurs because the forces produced by the move-to-goal schema all point
to the same place, and do not take into account that the actual “goal” of
each robot is different according to its position in the formation. Because the
robots will travel for some time along a straight line, they will have time to
greatly increase their speed towards the goal. So, if the maintain-formation
force is not large enough, it will not suffice to avoid the narrowing effect.

Fig. 3. Oscillatory behaviour

Table 2. Default parameters

Parameter                Value    Parameter                Value
Avoid-Static-Obstacle             Avoid-Moving-Obstacle
  Gain                   3          Gain                   1.5
  Maximum Range          50         Maximum Range          50
  Minimum Range          5          Minimum Range          5
Avoid-Robot                       Maintain-Formation
  Gain                   2.5        Gain                   4
  Maximum Range          10         Maximum Range          40
  Minimum Range          5          Minimum Range          3
Move-To-Goal                      Noise
  Gain                   3          Gain                   0.05

Avoidance Schemas: we must take care with the forces resulting from these 
schemas, in order to allow the robots to attain a goal that lies behind an 
obstacle (if the repulsion is too large, they will never be able to pass beyond 
that point). We can adjust these schemas in two ways: with a large intensity 
on a small radius and vice-versa. With a small radius, the influence of the 
schema will be felt independently by each robot (since some of them might 
be inside the radius and others not). With a large radius, the influence of 
the schema is felt more uniformly on the entire formation. 

Finally, the avoid-robots schema must only yield a force in extreme situa- 
tions, since, otherwise, we risk losing control with the antagonism between 
this schema and maintain-formation. 



4 Tests 

We conducted some tests with the same velocity model as Balch and Arkin [1] 
and the results we got were similar. To perform the tests with the acceleration 
model, and taking into account the previous considerations, some values for the 
parameters were chosen (see Table 2). 

The robots’ maximum velocity was limited to 30 spatial units per time unit and
they were given a mass of one. The robots’ dimension varies from test to test.
To make the avoid-robot schema an emergency force, we
gave it a gain of 2.5, an influence sphere of 15 and a minimum radius of 5.

In all tests, the world is similar, with a dimension of 1000x1000 spatial units
and no mobile obstacles (to permit comparing the tests). The sample period
was 0.1 time units and the value of noise gain was 10% of the maximum
acceleration.

4.1 Statistical Evaluation

The main evaluation of the system is qualitative. Nevertheless, we tried to add
some objective measures, giving a more concrete way to compare the
performance of the system with altered parameters or in different situations.
We therefore considered:

Path Ratio. This value is calculated by dividing the robots’ average travelled
distance by the distance between the various goals. This gives a notion of the
deviation of the robots from the correct path. Notice that this ratio has a
minimum of 1, and that it is normal in the simulations to have values
substantially greater than this limit: some of the robots, in order to
maintain formation, have to follow some kind of external path with respect to
the line between goals;

Formation Error. This is the sum of the position errors of each of the robots
relative to its ideal position at every instant of the simulation. It contains
information about the formation maintenance;

Average Formation Error. This is the average of the above value;

Simulation Time. Length of the simulation, in time units or number of
samples.
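
The statistics above could be computed along the following lines. This is a sketch under stated assumptions: trajectories are lists of (x, y) samples, `goals` is the ordered list of waypoints including the starting point, the formation error is summed over robots at each sample, and the standard deviation is the population one. None of these details is specified in the text, and all names are illustrative.

```python
import math
import statistics


def path_length(points):
    """Total length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def path_ratio(paths, goals):
    """Average distance travelled by the robots over the goal-to-goal distance."""
    average_travelled = sum(path_length(p) for p in paths) / len(paths)
    return average_travelled / path_length(goals)


def formation_error_stats(positions, ideal_positions):
    """positions[t][i] is the actual (x, y) of robot i at sample t."""
    per_sample = [
        sum(math.dist(p, q) for p, q in zip(actual, ideal))
        for actual, ideal in zip(positions, ideal_positions)
    ]
    return {
        "minimum": min(per_sample),
        "average": statistics.fmean(per_sample),
        "maximum": max(per_sample),
        "std_dev": statistics.pstdev(per_sample),
    }
```

A robot that follows the straight line between goals contributes a ratio of exactly 1, so any detour taken to keep formation pushes the path ratio above that minimum.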

When defining a robot’s position in the formation in relation to another
robot it is important, in order to achieve good results, to consider carefully
which robots should be related. The immediate option, defining one robot in
relation to its nearest neighbor, is not always the best solution.



4.2 Results Obtained 

1. Column Formation A formation with 50 spatial units between the robots
and a robot dimension of 10 was used. When the position is defined
in relation to the center of mass, the qualitative behavior of the formation
is good. The average formation error is low (36). Note that a very small change
in the orientation of the reference point, either a robot or, in this
particular case, the formation’s center of mass, causes an enormous variation
of the ideal position of the robot. The path ratio is around 1.07. This is
explained by the tendency to take the curves on the outside of the rectilinear
path between goals.

In the case of a formation relative to a leader, the error values are
considerably larger (91.0 on average and 421.2 at the maximum). This is because
in this case the reference point (the leader) has quicker oscillations than the
center of mass, introducing an unrealistic error measure (the maximum error
happens when, at the end of a rectilinear path, the leader suddenly changes its
orientation).

When facing a field of obstacles, the robots in this formation, independently
of its definition, tend to follow the same path, like a snake.



2. Line Formation A formation with 50 spatial units between the robots and
a robot dimension of 8 was used.

Table 3. Test results of the different formations

                  Column            Line              Diamond                    Wedge
                  Leader   CoM      Relative CoM      Relative Leader   CoM      Relative CoM
Path Ratio        1.08     1.07     1.10     1.11     1.14     1.08     1.07     1.08     1.06
Formation Error
  Minimum         17.41    6.43     1        2        10       0        1        13       2
  Average         91       36.3     583.3    152.3    200      166      72       341      225
  Maximum         421.2    106.4    1790     427.2    570      450      251      836      839
  Std. Dev.       70.6     25.7     435.5    96.7     121      124      50       214      181
Time              2581     1765     2198     3086     3087     2648     2693     2839     3216

This formation, when faced with an obstacle, produces a larger position error
than the column formation. Take into account that, in order to avoid an
obstacle, a robot must change its movement perpendicularly to its current
direction.

In the case of a relative formation, the gain of the maintain-formation schema
must be very small (0.4) to prevent an increasing oscillation that ultimately
makes the robots go around each other. We did not find a set of parameter
values that made the performance of this kind of formation definition
acceptable.

An important note on the behavior of this formation is the narrowing that
it suffers when approaching a goal.



3. Diamond and Wedge Formations In the diamond formation, the distance
between the robots and the center of mass was fixed at 75 units, and the
robots’ dimension at 10. The increase of this distance forced a relaxation of
the goal achievement condition. Thus, the desired distance to a goal was
increased to 40.

The best results, with the formation in relation to a leader or in relation to
the center of mass, were achieved with a concentrated and powerful
avoid-obstacle schema (with a gain of 9 and a radius of 20). On the other hand,
in formation relative to another robot, the best results were obtained with a
long-range, low-gain schema (100 and 2 respectively).

Fig. 4. Diamond formation defined in relation to a leader

Fig. 5. Ideal avoid obstacle force.

Note that, in the formation defined in relation to a leader, if the leader robot
turns, the others may go back a little or change their path in order to maintain
the formation. Because the leader has no concern about the formation, it has a
tendency to get to a goal faster than the others, and then has to wait for them.

In the wedge formation, the results of the tests are similar.

5 Discussion 

The results we found lead to some questions, both empirical and theoretical,
which will be discussed below.

5.1 Schemas Parameters 

In all cases, we found that a small radius but a strong gain for the
avoid-robots schema produces the best results. In fact, this emergency force is
unnecessary in a trivial situation: the maintain-formation schema will produce
the desired repulsive force when the robots are approaching each other.

When concentrated, the avoid-obstacle schemas produce a behavior where
each robot travels along the obstacles’ borders. Sometimes this normal,
observed behavior changes and the robots try to avoid the
obstacles in an oscillatory manner, approaching and retreating quickly.

A particular situation occurs when a robot is following a path that crosses
an obstacle’s center. The force generated by this schema will only slow the
robot down, but it will not induce a shift strong enough to make the robot
go around the obstacle. Also, this force is maintained after the robot passes
the obstacle, now in the form of a positive acceleration, as if the robot were
running away from the obstacle.

One improvement to this schema would be to give the force an orientation
somewhat different from the current one. In the case of Fig. 5, for example,
the ideal force would be perpendicular to the movement.
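
A possible reading of this improvement is sketched below: instead of pushing the robot radially away from the obstacle, the force is applied along the normal to the direction of movement, on the side of the path where the robot already is. This is a hypothetical construction, not something implemented in the paper; the left-hand convention for a path through the obstacle's center and the radial fallback for a stationary robot are assumptions.

```python
import math


def perpendicular_avoid_force(robot, obstacle, velocity, intensity):
    """Avoidance force perpendicular to the movement (hypothetical sketch)."""
    away_x, away_y = robot[0] - obstacle[0], robot[1] - obstacle[1]
    speed = math.hypot(velocity[0], velocity[1])
    if speed == 0.0:
        # No movement direction available: fall back to radial repulsion.
        dist = math.hypot(away_x, away_y)
        if dist == 0.0:
            return (0.0, 0.0)
        return (intensity * away_x / dist, intensity * away_y / dist)
    ux, uy = velocity[0] / speed, velocity[1] / speed
    # Left-hand unit normal to the direction of movement.
    nx, ny = -uy, ux
    # Flip the normal if the robot is on the right-hand side of the obstacle.
    if away_x * nx + away_y * ny < 0:
        nx, ny = -nx, -ny
    # When the path crosses the obstacle's center the side is arbitrary: keep left.
    return (intensity * nx, intensity * ny)
```

Because the returned force has no component along the velocity, it turns the robot without slowing it down before the obstacle or accelerating it afterwards, which are the two effects criticised above.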

One last observation: this schema’s gain must be substantially larger than 
the move-to-goal schema gain, in order to prevent a deadlock situation. 

When a robot achieves a goal it continues to suffer the effects of the
move-to-goal schema until the whole formation achieves that goal. Particularly
in leader-referenced formations, where the leader is in front, this causes the
robot to keep oscillating around the goal. We suggest as future work
keeping the force of this schema at zero in the neighborhood of the goal.

This last schema is also responsible for the narrowing, and for the slow
reaction in emergency situations. To restrict its influence to the essential,
we suggest that this schema’s gain should be inversely proportional to the
velocity of the robot.
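
The two suggestions made here for the move-to-goal schema could be combined as in the following sketch. It is a hypothetical illustration of future work, not a tested design: `goal_radius` and `reference_speed` are invented parameters, and the gain is held at `base_gain` below the reference speed so that the inverse proportionality stays finite. The default gain of 3 is the Table 2 value.

```python
import math


def move_to_goal_force(position, goal, velocity, goal_radius,
                       base_gain=3.0, reference_speed=1.0):
    """Move-to-goal force with a dead zone and a speed-dependent gain."""
    dx, dy = goal[0] - position[0], goal[1] - position[1]
    distance = math.hypot(dx, dy)
    if distance <= goal_radius:
        # Neighborhood of the goal: no force, so the robot stops oscillating.
        return (0.0, 0.0)
    speed = math.hypot(velocity[0], velocity[1])
    # Gain inversely proportional to speed, capped at base_gain for slow robots.
    gain = base_gain * reference_speed / max(speed, reference_speed)
    return (gain * dx / distance, gain * dy / distance)
```

A robot already moving fast towards the goal then receives a weaker pull, leaving more room for the maintain-formation and avoidance forces.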



5.2 Formation Definitions 

In every formation, the narrowing increases proportionally to the space between
robots. This introduces large error values. The relatively defined formations
work better when the maintain-formation schema uses a larger radius to limit
the controlled zone. This type of formation has large error values, even if the
formation is well formed (look at the minimum error values for these
formations).

In a leader-defined formation, non-leader robots should have zero gain in the
move-to-goal schema; thus, they should be limited to following the leader.



5.3 Statistical Evaluation 

The formation error used is an ambiguous statistic, and should be used with
great caution. For example, it can be a good method to compare different tests
with the same formation, but it is not so good for comparing different
formation definitions. In order to quantify the oscillation that robots suffer,
the energy spent in a simulation should be considered in future work.



6 Conclusions 

As shown by the tests performed, the new simulation model yields more
realistic results. Indeed, the old model, based only on the velocity of
the agents, is not only unrealistic but also much more sensitive to noise, thus
introducing more oscillations in the robots. It also requires careful tuning of
the schemas to prevent unlimited velocities. If the aim is to create a system
that can eventually be used in a real-world environment, a velocity model is
evidently inadequate. When presented with different situations (different
formations, for instance) it does not have the robustness we would desire,
needing to be tuned differently for each case.

The acceleration model we have presented in this paper brings new advantages
to the simulation but it also introduces some complexity in the control
system. It is not evident which schemas should be considered, nor how
the several parameters should be tuned. However, once it is tuned, it performs
correctly in a wide range of situations, as shown by the results presented.

Abstract. This paper presents the experimental TRYSA2 distributed
decision support system, which has been designed for the management of
the urban motorway network around Barcelona. It shows how different
technologies in the area of intelligent agents can be combined to solve
this real-world decision support problem: on the one hand, a pre-established
distribution of the loci of decision-making and the nature of
local traffic management tasks suggested applying “deliberate” problem-solving
agents; on the other, the complexity of the co-ordination
task called for an “emergent” approach, in order that the overall decision
support functionality be the result of non-benevolent agent interactions.
The paper sets out from a description of our particular traffic
management problem. Subsequently, the architecture of TRYSA2 is
outlined, pointing out the design strategy followed and describing how
the different design steps have been realised. Finally, we discuss the
lessons learnt from building this multiagent application.



1. Introduction 

The problem of coherent distributed decision-making is intrinsic to many complex
real-world situations, where the behaviour of a complex dynamic system is regulated
by human operators, who can perform particular control actions on different parts of
the system. In the management of a computer network, for instance,
different local administrators are responsible for reconfiguring certain sub-networks
to improve aspects of local and global network performance; another example is
traffic management in a road network, where an operator decides upon the sets of traffic
signals to be set in different parts of the network, so as to overcome local traffic
problems. In general, local decisions respecting control actions for one part of the
system affect the effectiveness of others. In consequence, it is necessary to achieve
a certain level of co-ordination to obtain a coherent set of local control decisions. In
such a scenario, a distributed decision support system (DSS) is a software tool that
assists operators in their decision-making, by automatically monitoring a dynamic
system, warning about present or future undesired situations and suggesting
appropriate coherent control actions to operators [5].

The nature of distributed decision support often “calls for” an architecture based on
multiple cognitive agents: the structure of the agent society reflects the internal
structure of the system to be controlled, thus coping with potential communication or
privacy requirements, while the knowledge-based (i.e. “cognitive”) approach allows
the operators’ expertise to be modelled explicitly, so that they can understand the
system’s advice as well as the reasons that justify it. Still, the design of distributed
DSSs of this type turns out to be a rather difficult task for two major reasons: first,
co-ordination among the different loci of decision-making (and the corresponding
support agents) becomes significantly harder when the size of the system grows, a fact
which is especially important for real-world applications [9,12]; second, despite recent
work in this direction [7,8,10], the lack of adequate design strategies and methodologies
gives the design of such systems some flavour of a “black art”.

This paper describes how these difficulties have been attacked in a real-world case:
it presents the architecture of a distributed DSS in the domain of urban traffic
management as an example of a practical application of decentralised multiagent
technology. First, our particular traffic management problem in the urban motorway network
of Barcelona is outlined. Then, the TRYSA2 traffic management system is presented,
describing our design strategy, the agent models that this strategy gave rise to, and the
ProsA2 agent architecture that operationalises these agent models. Finally, we discuss
the lessons learnt from building this decentralised multiagent application.




Figure 1. A traffic management infrastructure

2. The Problem: Urban Traffic Control

In big cities, traffic control centres are in charge of managing urban transport, so as to 
maintain and restore the “smooth” flow of vehicles. In Barcelona, the local traffic 
control centre JPT is in charge of this job: traffic engineers continuously receive in- 
formation about the traffic state, identify potential problems, and act upon control 
devices to overcome them. It has become particularly difficult for the JPT engineers 
to perform this job in real time, as in the follow-up of the 1992 Olympic Games the 
traffic management infrastructure in Barcelona has become increasingly complex. 
Nowadays, information about the traffic state of the urban motorway network, con- 
sisting of one ring-road and seven adjacent motorways, is provided by over 300 tele- 
metered sensors (“loop detectors”) via fibre optics communication links. Control 
actions can be taken by means of 52 Variable Message Signals (VMS), 3 traffic lights 
for junction control, as well as by ramp metering on 7 ring-road drives. Figure 1 il- 
lustrates typical elements of this traffic management infrastructure. 

The TRYS system has been developed to provide real-time decision support for JPT 
traffic controllers [3]. In line with the traffic engineer’s logical subdivision of the road 
network into problem areas, TRYS relies on a set of knowledge-based traffic control 
agents, each responsible for traffic management in one such area. On the basis of 
sensor data, operator notifications and contextual information, each agent generates 
proposals of signal plans for control devices. Potential conflicts between agents
(problem areas usually overlap!) are resolved by a special co-ordinator agent, which
receives control plans from the traffic control agents and harmonises them, so as to 
obtain globally consistent signal plans. These signal plans are presented to the opera- 
tor who finally decides to enact or to modify them. TRYS has been installed and is 
being evaluated at the Barcelona test site [3]. 

Although this architecture has been shown to perform well, it also suffers from
difficulties in scalability. The complexity of the co-ordination task grows exponentially
with the number of traffic control agents and, in addition, it becomes increasingly complex
to elicit co-ordination knowledge from traffic control experts. In this paper we tackle the
problem of how to eliminate the co-ordinator agent, by augmenting the degree of local
autonomy of the traffic control agents and having the functionality of the co-ordinator
emerge as a consequence of the interaction between neighbouring agents (see Figure 2).




Figure 2. Centralised (TRYS) and decentralised co-ordination (TRYSA2)

3. The TRYSA2 System

In this section we present the TRYSA2 system (TRYS Autonomous Agents), in which
autonomous traffic control agents co-ordinate their signal plans in a decentralised
fashion. TRYSA2 augments local problem-solving capacities of TRYS agents by a
model of self-interested pursuit of (local) goals, thus converting the original
“benevolent” TRYS agents into autonomous agents. So, following the problem
area decomposition of the TRYS approach, the TRYSA2 experimental system consists
of 11 traffic control agents that jointly manage the traffic in the motorway network
around Barcelona (Figure 3).

In the sequel we first outline the design approach that we followed when
developing the TRYSA2 system. Subsequently, its application to the design of TRYSA2
agents is outlined. Finally, we sketch the implementation and operation of the system.




Figure 3. Autonomous traffic agents for Barcelona 



3.1 The design process 

Problem-solving and co-ordination in TRYSA2 rely on the mechanism of structural
co-operation [13,14], that has been developed to achieve co-ordination within
societies of autonomous problem-solving agents. Within structural co-operation, the
functionality of an agent system is determined by the agents’ local goals, the dependence
relations [16] that their environment implies, as well as normative prescriptions [2]:
norms bias agent interactions, which influence self-interested agent behaviour so as to
make it instrumental with respect to a global functionality [12]. In accordance with
this mechanism, multiagent system design is a three-stage process.

1. Individual stage: design of local problem-solving. 










104 Sascha Ossowski et al. 



Agents are endowed with a “motivation” on the basis of which they generate
individual goals in response to certain undesired situations; they are provided with
problem-solving mechanisms that enable them to achieve (or: “head towards”)
these goals. No reference to potential interference with other agents is made. The
traffic control agents of the original TRYS system are precisely of this type.

2. Social stage: modelling of autonomous agent interaction. 

The possible interdependencies between the agents’ problem-solving actions are
determined, giving rise to the multiple possibilities of conflict and synergy between
them. Self-interested agent behaviour is to be modelled on this basis (e.g. when to
ask others for help and when to grant or deny help [15]), and its impact on society
level is determined. In the TRYSA2 system we use a model from bargaining theory
(the Nash solution), accept it as the outcome of self-interested agent interaction,
and perform distributed search for the corresponding solution.

3. Normative stage: design of a functional normative bias. 

The “equilibrium” that results from autonomous agent interaction need not
correspond to a functional behaviour at society level. So, normative prescriptions have
to be designed that bias the result of agent interaction in a desired direction. In the
TRYSA2 system prescriptions are used to augment the relative importance (or
“power”) of a problem area, by giving the associated agent the right to enact
certain signal plans.
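
The Nash solution mentioned in the social stage selects, among the feasible agreements, the one that maximises the product of the agents' utility gains over the outcome they would obtain without agreement. The sketch below illustrates this over a finite list of candidate agreements. It is a centralised enumeration for illustration only, whereas TRYSA2 performs a distributed search, and the data layout and names are assumptions.

```python
from math import prod


def nash_solution(candidates, disagreement):
    """Pick the candidate maximising the product of gains over disagreement.

    candidates: list of (label, utilities), one utility per agent.
    disagreement: utilities each agent obtains if no agreement is reached.
    """
    best_label, best_product = None, -1.0
    for label, utilities in candidates:
        gains = [u - d for u, d in zip(utilities, disagreement)]
        if min(gains) < 0:
            continue  # some agent would be worse off than without agreement
        product = prod(gains)
        if product > best_product:
            best_label, best_product = label, product
    return best_label
```

For two agents with disagreement utilities (1, 1), a balanced agreement with utilities (3, 3) is preferred over a lopsided one with (4, 1), since the latter leaves one agent with no gain at all.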

Similar design strategies are currently being investigated by other researchers [6]. The
ProsA2 (Problem-solving Autonomous Agent) architecture has been developed to
support this design strategy, and constitutes the basis of the operationalisation of the
TRYSA2 system [13]. ProsA2 is a vertically layered agent architecture [11], whose
layering reflects the different stages of structural co-operation: each layer can be
designed and tested separately in accordance with the design steps outlined above.




Figure 4. ProsA2 agent architecture

The general architecture of a ProsA2 agent is depicted in Figure 4. It comprises three
subsystems. The perception subsystem is endowed with perceptors that capture
stimuli of the outside world. In the TRYSA2 system it is in charge of perceiving data from
the road sensors as well as messages from its acquaintances.

The result of the perception process is passed to the cognition subsystem, where the 
(problem-solving and deontic state) information models are updated. This new infor- 
mation provided by perceptors can bring them into an “inconsistent” state. The three 
layers of the cognition subsystem are in charge of reacting to these changes by re- 
storing model consistency. Based on their particular layer knowledge, they all run 
different instantiations of the same three phase control loop: first, significant changes 
in the information models are detected (e.g. new data from some sensors); second, the 
reasons for such inconsistency are determined; third, the adequate model updates are 
determined (e.g. generating new local signal plans). Each layer is responsible for 
maintaining the consistency of particular parts of the information models. In TRYSA2 
agents, the individual layer generates alternative signal plans on the basis of available 
traffic data (and ranks them so as to maximise their positive impact in the traffic flow 
of the agent’s local problem area). The social layer sets out from the interdependen- 
cies between the agents’ local signal plans, modifies these local proposals and/or their 
ranking accordingly, and indicates pertinent messages to be sent. On the basis of 
contextual information, the normative layer deduces permissions or prohibitions to 
use certain control devices and, in consequence, to enact certain signal plans. 

Finally, the action subsystem checks for changes in the information models and 
manipulates the agent’s effectors accordingly. In the case of TRYSA2 agents, it is in 
charge of sending messages to other agents and of informing about newly proposed 
road signal plans. 

The core functionality of ProsA2 agents is provided by the knowledge units (KUs) 
[4] of each of its layers. Figure 5 shows the knowledge endowment of the layers of 
TRYSA2 agents. In the sequel, we will describe each layer, paying special attention to 
the contents of each of these KUs and describing how this knowledge is articulated in 
the layers’ reasoning cycles. 

Figure 5. Knowledge units of the TRYSA2 layers 

3.2 Individual stage 

The individual stage of the design process corresponds to the TRYSA2 agents’ local 
problem-solving and is implemented through the individual layer. 

Every couple of minutes, a TRYSA2 agent receives time series of magnitudes 
such as traffic speed, flow and occupancy from the road sensors of its area. These raw 
data are first pre-processed in order to filter out noisy and erroneous values. 
Subsequently, data abstraction is performed, calculating aggregate magnitudes such as 
temporal and spatial gradients for the different sections. Both tasks are performed by 
means of the physical network structure KU. The agent updates its problem-solving 
model with this completed traffic information. 

In step two, problem identification (and also some part of problem diagnosis) is 
performed by matching the abstracted traffic data against the frames in the problem 
scenario KU. Figure 6 shows one such frame that matches the abstracted traffic data. 
Suppose that, as a result of data abstraction, low speed and high occupancy are 
identified in Ronda de Dalt en Diagonal, and medium to high speed and low occupancy in 
Ronda de Dalt en d'Esplugues. These facts match the frame shown in Figure 6, so that an 
incident in the central lane of Diagonal road is identified, which manifests itself as a 
traffic excess (with respect to the road’s capacity) of 2200 veh/h between Diagonal and 
Llobregat in the Dalt ring-road. Traffic from Collcerola to Llobregat and, to a lesser 
degree, from Diagonal heading towards Llobregat contributes to this excess. 
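
The matching step of problem identification can be pictured with a minimal Python sketch. The frame layout, the field names and the helpers `matches` and `identify_problems` are assumptions made purely for illustration; the text describes the TRYSA2 frame representation only informally. The values are those of the example frame in Figure 6.

```python
# Hypothetical frame layout (an assumption): each section maps a magnitude
# to the set of qualitative values that the frame admits.
FRAME = {
    "name": "Incident congestion in the central lane at Diagonal",
    "conditions": {
        "Ronda de Dalt en Diagonal": {"speed": {"low"}, "occupancy": {"high"}},
        "Ronda de Dalt en d'Esplugues": {"speed": {"medium", "high"},
                                         "occupancy": {"low"}},
    },
    "excess_veh_per_h": 2200,
}


def matches(frame, abstracted):
    """Return True if every condition of the frame holds in the abstracted data."""
    for section, required in frame["conditions"].items():
        observed = abstracted.get(section, {})
        for magnitude, admissible in required.items():
            if observed.get(magnitude) not in admissible:
                return False
    return True


def identify_problems(frames, abstracted):
    """Names of all problem scenario frames that match the abstracted data."""
    return [frame["name"] for frame in frames if matches(frame, abstracted)]


# Abstracted traffic data as in the example of the text.
observed = {
    "Ronda de Dalt en Diagonal": {"speed": "low", "occupancy": "high"},
    "Ronda de Dalt en d'Esplugues": {"speed": "medium", "occupancy": "low"},
}
problems = identify_problems([FRAME], observed)
```

With the data above, the example frame is the only one reported; a section whose observed value falls outside the admissible set makes the frame fail.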

Step three, the control recommendation phase, adheres to the following line of rea- 
soning: first, the historic traffic demand between nodes is retrieved and the contribu- 
tion of each path to the problem in the critical section calculated. This is done by 
matching the current abstract traffic state and the state of the control devices against 
the distribution scenario frames. 
Incident congestion in the central lane at Diagonal 

    Section: Ronda de Dalt en Diagonal 
        speed: low 
        occupancy: high 

    Section: Ronda de Dalt en d'Esplugues 
        speed: medium, high 
        occupancy: low 

    Critical section: between Ronda de Dalt en Diagonal 
                      and Ronda de Dalt en d'Esplugues 
        excess: 2200 veh/h 
        From Collcerola to Llobregat -> [60, 80] % 
        From Diagonal to Llobregat   -> [20, 40] % 

Congestion warning in Ronda de Dalt at Diagonal 

    State of control devices 
        Panel 17PIV1: congestion at Diagonal 
        Panel 13PIV2: congestion at Diagonal 
        Panel 8PIV1:  congestion at Diagonal 
        Regulator R1: contention level medium 

    Paths 
        From Collcerola to Llobregat through Ronda de Dalt     -> [40, 60] % 
        From Collcerola to Llobregat through Can Caralleu      -> [30, 40] % 
        From Collcerola to Llobregat through alternative paths -> [10, 20] % 

    State of control zones 
        From Collcerola to Can Caralleu: free 
        From Can Caralleu to Diagonal:   with problems 
        From Diagonal to Llobregat:      with problems 


Figure 6. An example scenario 

Coherent alternative signal plans are generated by using the distribution scenario KU 
once again: every frame applicable to the current situation is pre-selected. Assume 
that this is the case for the frame shown in Figure 6. Its short-term effects are 
estimated by simulating its impact on the current traffic situation. This is done by using 
the physical network structure knowledge to assign traffic to the road network in 
accordance with the distribution of traffic volume among paths that it specifies. The 
frame in the example specifies that about one half of the traffic volume from Coll- 
cerola to Llobregat will pass through the Dalt ring-road, while a smaller amount 
chooses a path through Can Caralleu or other alternative paths, if the corresponding 
signal plan is set. If the simulation shows a reasonable decrease of excess in the criti- 
cal section, the frame’s signal plan constitutes one recommendation of the system. In 
the example, it is suggested to display congestion warnings at Diagonal for panels 
17PIV1, 13PIV2 and 8PIV1, while setting the contention level of regulator R1 to me- 
dium. 

Signal plan recommendations are ranked according to their utility, i.e. by their ex- 
pected reduction of (local) traffic excess. They constitute the individual action alter- 
natives and are stored in the agent’s information model. 



3.3 Social stage 

In the social stage, agents take signal plan interrelation into account, reconsidering 
their choice of alternative local signal plans so as to achieve maximum alleviation of 
congestion in their local problem areas. Plan interrelation can be of a logical or 
physical nature: either one local signal plan proposal influences the effectiveness of another 
(e.g. by diverting traffic from one problem area, thereby increasing the traffic demand in 
the other, already congested area); or some local signal plans intend to set the same 
control device in different, incompatible states (e.g. by displaying different messages 
at the same VMS). 

For this purpose, an agent keeps track of the current local signal plans of its 
acquaintances in its information model. The three-step control cycle of the social layer is 
driven by this information: when messages arrive, indicating that some acquaintance 
has switched to another local signal plan (“value” messages) or that certain 
combinations of local plans are illegal (“no-good” messages), the agent checks whether its 
currently most preferred local signal plan is still consistent. If not, it tries to find its 
most preferred local signal plan that re-establishes consistency, and informs interested 
agents about this change by sending “value” messages. If there is no such plan, it 
sends “no-good” messages, inducing other agents to switch to another local signal 
plan. Once one agent detects that all agents are in a consistent state, it stores the 
current set of local signal plans as a potential solution. At the end of this process, one 
solution is chosen in accordance with the underlying bargaining model. 
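
The reaction of a single agent to incoming messages can be sketched roughly as follows. This is a strongly simplified illustration, not the TRYSA2 implementation: the function `react`, the encoding of no-goods as sets of (agent, plan) pairs and the shape of the returned messages are all assumptions, and the priority handling of the underlying search algorithm is omitted.

```python
def react(ranked_plans, view, nogoods, me):
    """Pick the most preferred local plan consistent with the acquaintances' plans.

    ranked_plans: this agent's local signal plans, most preferred first.
    view:         dict mapping each acquaintance to its current local plan.
    nogoods:      set of frozensets of (agent, plan) pairs known to be
                  illegal in combination (assumed encoding).
    Returns ("value", plan) if a consistent plan exists, otherwise
    ("no-good", view) to induce other agents to change their plans.
    """
    for plan in ranked_plans:
        assignment = set(view.items()) | {(me, plan)}
        # The plan is consistent if no known no-good is contained in the
        # combined assignment.
        if not any(nogood <= assignment for nogood in nogoods):
            return ("value", plan)
    return ("no-good", frozenset(view.items()))


# Hypothetical situation: plan p1 of agent "a" clashes with q1 of agent "b".
nogoods = {frozenset({("a", "p1"), ("b", "q1")})}
message = react(["p1", "p2"], {"b": "q1"}, nogoods, me="a")
```

In this hypothetical situation the agent falls back to its second-ranked plan and would announce it with a “value” message.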

In order to reject or modify inconsistent local plans, the social layer is endowed 
with the plan interrelation KU, which expresses dependencies between plans and 
possible ways of dealing with them in terms of states of control devices. This 
knowledge is represented by rules that obey the following format: 

    $[cdev_1, \ldots, cdev_n] \rightarrow [cdev'_1, \ldots, cdev'_m]$   or 

    $[cdev_1, \ldots, cdev_n] \rightarrow [nogood]$ 

The operational semantics of such a rule determines that the control device states of 
the antecedent can be substituted by those of the consequent without any important 
changes in the effect of the signal plans (e.g. by merging different messages 
“congestion at A” and “congestion at B” to be displayed on the same VMS into one 
“congestion at A and B” message). If control devices are merely incompatible, the 
consequent is the constant nogood. 

The above knowledge defines relations between plans. The agent dependence KU 
hosts knowledge about relations between agents in the shape of rules of the form 

    $[cdev_1, \ldots, cdev_n] \rightarrow [a_1, \ldots, a_m]$ 

If all control devices $cdev_1$ to $cdev_n$ switch to new states, then this concerns the agents 
$a_1$ to $a_m$. Note that these rules actually compile knowledge about the capabilities of an 
agent’s acquaintances against the background of possible plan relations. For instance, if 
agent $a_i$ may set a message $M_i$ on VMS $P$, and $a_j$ possibly displays a message $M_j$ on 
the same panel, while both messages are incompatible, then the knowledge base 
of $a_i$’s dependence KU will contain a rule stating that setting $M_i$ on VMS $P$ concerns 
agent $a_j$. The KU serves two purposes: when used with forward inference, it allows an 
agent to deduce which agents are to be informed about changes in its local signal 
plans; using backward inference, it enables an agent to determine its “social strength”, 
by deducing the acquaintances that can affect the executability or the outcome of local 
signal plans. 
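
The two uses of the agent dependence rules can be illustrated with a small Python sketch. The rule encoding, the example rules and both function names are hypothetical, and the backward direction shown here is only one possible reading of the text: it starts from an acquaintance and collects the control device states through which that acquaintance is linked to the agent's own plans.

```python
# Hypothetical rules: a set of control device states -> agents concerned.
RULES = [
    (frozenset({("VMS P", "M_i")}), {"a_j"}),
    (frozenset({("R1", "medium")}), {"a_j", "a_k"}),
]


def agents_to_inform(changed_states, rules):
    """Forward use: which agents are concerned by these device state changes?"""
    concerned = set()
    for antecedent, agents in rules:
        # A rule fires only if all device states of its antecedent changed.
        if antecedent <= changed_states:
            concerned |= agents
    return concerned


def states_linked_to(agent, rules):
    """Backward use: which device states tie the given acquaintance to us?"""
    return {state
            for antecedent, agents in rules if agent in agents
            for state in antecedent}


informed = agents_to_inform({("VMS P", "M_i")}, RULES)
linked = states_linked_to("a_k", RULES)
```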

From a global point of view, the agent behaviour outlined above implies a 
distributed multi-stage constraint optimisation algorithm, based on ideas of asynchronous 
weak commitment search [17]. In the sequel, we will just sketch this algorithm. In 
stage 1, setting out from the local sets of alternative signal plans, agents repeatedly 
exchange messages, so as to determine the set of undominated consistent local signal 
plans. This is done in an asynchronous distributed fashion that allows for local and 
temporarily incompatible views of the overall state. The agent that detects the 
termination of stage 1 takes the initiative in stage 2. On the basis of the outcome of stage 1 
it computes an (approximate) probability with which each of the sets of consistent 
local signal plans shall be enacted, so as to maximise the product of local agent 
utilities (i.e. of local traffic excess reduction). Finally, in stage 3 a set of signal plans is 
selected at random in accordance with the outcome of stage 2, and the agents are 
urged to enact the corresponding local signal plans. 
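
Stages 2 and 3 can be pictured with the following Python sketch. It is an illustration under stated assumptions, not the algorithm analysed in [13]: the probabilities are approximated by a coarse grid search over lotteries, the function names and the utility table are hypothetical, and the grid search is only practical for a handful of candidate plan sets.

```python
import itertools
import math
import random


def nash_lottery(utilities, steps=20):
    """Approximate the lottery over candidate plan sets that maximises the
    product of the agents' expected utilities.

    utilities[k][a] is the utility of candidate set k for agent a.
    Probabilities are restricted to multiples of 1/steps (an assumption).
    """
    k = len(utilities)
    n_agents = len(utilities[0])
    best_p, best_value = None, -math.inf
    for combo in itertools.product(range(steps + 1), repeat=k - 1):
        if sum(combo) > steps:
            continue
        p = [c / steps for c in combo] + [(steps - sum(combo)) / steps]
        expected = [sum(p[i] * utilities[i][a] for i in range(k))
                    for a in range(n_agents)]
        value = math.prod(expected)
        if value > best_value:
            best_p, best_value = p, value
    return best_p


def select_plan_set(candidates, probabilities, rng=random):
    """Stage 3: draw one candidate at random according to the lottery."""
    return rng.choices(candidates, weights=probabilities, k=1)[0]


# Two hypothetical candidates, each favouring a different agent.
lottery = nash_lottery([[4, 0], [0, 4]])
```

For the symmetric example both candidates receive probability 0.5, since the product of the expected utilities 16p(1 - p) peaks at p = 0.5.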

In TRYSA2, this process shows anytime properties: in time-critical situations the 
distributed search phase can be interrupted and a potential signal plan is enacted 
directly. A more detailed analysis of the algorithm can be found in [13]. Note that the 
above procedure allows calculating the outcome of self-interested agent interaction by 
co-operative distributed search. 



3.4 Normative stage 

The normative stage of the design process biases the overall behaviour of TRYSA2 in 
a desired direction, by letting the normative layer issue prescriptions in relation to 
specific traffic situations. As Figure 5 indicates, the norm KU is in charge of hosting 
and enacting this knowledge. 

In TRYSA2, normative situations are supposed to vary with the traffic demand 
structure. In consequence, the norm KU qualifies normative situations temporally by 
means of frames. In the “temporal” section of these frames, the current date and time 
is classified in terms of two categories: type of day and type of season. Values for the 
former are either Working day, Sunday, Saturday or Holiday. Rules specify how the 
value Holiday is derived. The category season may be instantiated to Xmas, Easter, 
Summer, or Normal. Again, the values are related to the current date and time by 
means of rules. The “normative” section describes a normative structure as a function 
of the above information. A normative structure is defined by two categories: 
prohibitions and permissions. Each of these categories may be instantiated by a set of 
control device states. 

We require the knowledge bases of the agents’ norm KUs to be globally consistent: 
if one agent is allowed to use a control device, all others that might access it are 
prohibited from using it. The associated reasoning method first classifies the current date and 
time, in order to match the resulting temporal categories against the normative 
patterns. By means of this method the agent can infer the normative situation which is 
pertinent to it. 
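
The reasoning method of the norm KU (classify the date, then match the temporal categories against the frames) might look as follows in Python. All concrete content is assumed for illustration: the holiday list, the season boundaries, the two frames and the device states are hypothetical, and the Easter category is left out because the text does not give its rules.

```python
from datetime import date

# Hypothetical rule base (assumption): a single holiday and month-based seasons.
HOLIDAYS = {date(1998, 10, 12)}


def type_of_day(d):
    """Classify a date as Working day, Saturday, Sunday or Holiday."""
    if d in HOLIDAYS:
        return "Holiday"
    return {5: "Saturday", 6: "Sunday"}.get(d.weekday(), "Working day")


def type_of_season(d):
    """Classify a date as Xmas, Summer or Normal (Easter omitted here)."""
    if (d.month, d.day) >= (12, 20) or (d.month, d.day) <= (1, 6):
        return "Xmas"
    if d.month in (7, 8):
        return "Summer"
    return "Normal"


# Hypothetical norm frames with a "temporal" and a "normative" section.
NORM_FRAMES = [
    {"temporal": {"day": {"Working day"}, "season": {"Normal"}},
     "normative": {
         "permissions": {("Panel 17PIV1", "congestion at Diagonal")},
         "prohibitions": {("Regulator R1", "contention level high")}}},
    {"temporal": {"day": {"Saturday", "Sunday", "Holiday"},
                  "season": {"Normal", "Summer"}},
     "normative": {
         "permissions": set(),
         "prohibitions": {("Panel 17PIV1", "congestion at Diagonal")}}},
]


def normative_situation(d, frames):
    """Return the normative structure of the first frame matching the date."""
    day, season = type_of_day(d), type_of_season(d)
    for frame in frames:
        temporal = frame["temporal"]
        if day in temporal["day"] and season in temporal["season"]:
            return frame["normative"]
    return {"permissions": set(), "prohibitions": set()}


situation = normative_situation(date(1998, 10, 5), NORM_FRAMES)
```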



3.5 The system 

The TRYSA2 system has been implemented experimentally on networked worksta- 
tions. The TRYSA2 agents constitute separate Prolog processes (with some extensions 
in C++), which communicate via sockets. The Barcelona test site is simulated by the 
AIMSUN traffic simulator [1]. AIMSUN is endowed with a precise description of the 
traffic management infrastructure at the test site, including detailed models of the road 
network, the sensors, control devices etc., and performs microscopic (“car by car”) 
simulation of traffic flows. A special observer agent has been implemented in Tcl/Tk 
in order to visualise the problem-solving process and its results. Figure 7 shows a 
snapshot of the interface window of the observer agent. 

The dynamics of the system may be illustrated by the typical lines of reasoning 
within TRYSA2, which are determined by the three classes of events that may cause 
agent activation: 

• If new data about the current traffic state arrives, an agent’s individual layer gener- 
ates a set of local signal plans. The change in the information model triggers the 
social layer which starts a social interaction process. Messages are sent and re- 
ceived in accordance with the distributed algorithm outlined above, and the agents 
adapt their local signal plans accordingly. 

• The normative layer deduces a new normative situation when the current temporal 
context has changed. If necessary, the acquaintances are informed about this; 
otherwise the local information model is updated. If the social layer detects a change 
in the set of potential signal plans, it initiates a social interaction process as above. 

• Finally, when messages from an agent’s acquaintances are received, the 
corresponding changes in the information model trigger the social layer. The layer 
reacts to this by restoring local consistency and sending messages. Again, agent 
behaviour follows the distributed algorithm outlined above. 

4. Conclusions 

In this paper we have presented the architecture of the TRYSA2 distributed decision 
support system that has been designed experimentally for the management of the 
urban motorway network around Barcelona. We have reported our design strategy for 
this type of system and presented an agent architecture to operationalise it. This 
approach led to a knowledge-based multiagent system that combines deliberate local 
problem-solving with emergent co-ordination. 

The TRYSA2 system shows that the principled design of distributed DSSs on the 
basis of a cognitive multiagent architecture is feasible and adequate. By comparing 
this approach, which relies on a decentralised, “emergent” co-ordination model, to the 
original TRYS system, we conclude that the present approach promotes scalability. 
When a new acquaintance enters the system, agents still need to be informed about its 
capabilities and, if the newcomer may enact previously unknown signal plans, the 
interrelation of these plans with existing control actions is to be added to the agents’ 
knowledge bases. Still, the introduction of the new agents produces a shift in the so- 
cial equilibrium, leading to a new base-line co-ordination without any further modifi- 
cations to the agent knowledge. In the centralised approach, however, this effect can 
only be achieved by completely reconsidering the priority relations that a distin- 
guished co-ordinator agent is endowed with. 

In future work we will further refine the normative knowledge within the TRYSA2 
system. The different effects of particular types of prescriptions in real-world traffic 
situations will be examined by means of experimental studies. In addition, we are 
thinking of applying multiagent learning techniques to this task. 

Abstract. We have designed several new lazy learning algorithms for 
learning problems with many binary features and classes. This particular 
type of learning task can be found in many machine learning applications 
but is of special importance for machine learning of natural language. 
Besides pure instance-based learning we also consider prototype-based 
learning, which has the big advantage of a large reduction of the required 
memory and processing time for classification. As an application for our 
learning algorithms we have chosen natural language database interfaces. 
In our interface architecture the machine learning module replaces an 
elaborate semantic analysis component. The learning task is to select 
the correct command class based on semantic features extracted from 
the user input. We use an existing German natural language interface 
to a production planning and control system as a case study for our 
evaluation and compare the results achieved by the different lazy learning 
algorithms. 



1 Introduction 

In this paper we introduce several new lazy learning algorithms, which are es- 
pecially useful for learning problems with a large number of binary features and 
classes. We define binary features as features which possess only the two values 
0 and 1. A value of 1 encodes the situation that an instance contains the feature; 
otherwise the value equals 0. 

Such learning tasks are typical for machine learning of natural language but 
can also be found in many other applications. In particular, for machine learning 
of natural language there exists some empirical evidence [16] that the 
abstractions achieved by using model-based algorithms add no additional predictive 
power. On the contrary, even limited forms of generalization can harm the 
performance of the algorithm due to the many subregularities and exceptions that 
are characteristic of linguistic problems. 

In our work we consider two subgroups of lazy learning: instance-based 
learning and prototype-based learning. Instance-based learning approaches represent 
the learned knowledge simply as a collection of training cases or instances. A new 
case is classified by finding the instance with the highest similarity and by using 
its class as prediction [9]. Instance-based algorithms need only a very small 
training effort but a large amount of memory and processing time for classification 
because the algorithm has to compare new cases with all existing instances. 

An alternative to instance-based approaches is prototype-based learning, 
which creates a prototype for each class during training [10,5]. These prototypes 
are then used for the comparison with new cases. This has the big advantage 
that it is no longer necessary to store any training instances as learned 
knowledge. In addition, the number of required comparisons during classification 
is reduced to the number of existing classes. 

We have developed the lazy learning algorithms as part of our machine learn- 
ing workbench, which also includes several algorithms for decision tree learn- 
ing, rule-based learning, and hybrid approaches [15]. All of these algorithms 
have been implemented by means of the deductive object-oriented database 
ROCK & ROLL [3]. The use of the available powerful programming language en- 
ables the efficient implementation of a large variety of different learning 
paradigms. It also gives the user a convenient integrated tool, which assists him 
in applying the algorithms to the data collection stored in the same database 
(see also [6]). 

The learning task we chose for our algorithms concerns natural language 
database interfaces. One of the main obstacles to the efficient use of natural 
language interfaces is the large amount of manual knowledge engineering that is 
often required (see [2] for a recent survey). This time-consuming and tedious process 
is often referred to as the infamous “knowledge acquisition bottleneck”. It may 
require extensive efforts by experts highly experienced in linguistics as well as 
in the domain and the task [8]. Therefore, natural language interfaces represent 
a domain that is very well suited for the application of lazy learning algorithms 
to automate the acquisition process of linguistic knowledge. 

The rest of the paper is organized as follows. First, we introduce the lazy 
learning algorithms in detail before we present the learning task: the application 
of machine learning to natural language database interfaces. Finally, we explain 
the set-up of an extensive case study and discuss the results from the evaluation. 

2 Lazy Learning Algorithms 

2.1 Instance-Based Learning 

The different proposed instance-based learning algorithms vary in how they 
assess the similarity (or distance) between two instances. Two very popular 
methods are IB1 [1] and IB1-IG [4]. IB1 applies the simple approach of treating all 
features as equally important whereas IB1-IG uses the information gain [7] of 
the features as weighting function to take account of the different relevance of 
the individual features. 

Besides implementing these two benchmark algorithms, we have developed a 
new algorithm called BIN-CAT for binary features with class-dependent 
weighting and asymmetric treatment of the feature values. In BIN-CAT we calculate 
the similarity between a new case $X$ and a training case $Y$ using the following 
formula: 

\[
\begin{aligned}
SIM(X, Y) = {} & \sum_{i=1}^{n} p(D_i, C_Y) \cdot w_i \cdot \sigma(x_i, y_i) \; - \\
               & \sum_{i=1}^{n} p(D_i, C_Y) \cdot w_i \cdot \delta_Y(x_i, y_i) \; - \\
               & \sum_{i=1}^{n} [1 - p(D_i, C_Y)] \cdot w_i \cdot \delta_X(x_i, y_i) \, .
\end{aligned}
\tag{1}
\]

Here, $n$ indicates the number of features, $D_i$ the collection of those instances 
that have value 1 for the $i$th feature, and $C_Y$ the class of the training case $Y$. 
The term $p(D_i, C_Y)$ then denotes the proportion of instances in $D_i$ that belong 
to class $C_Y$ to the total number of training cases for $C_Y$. $\sigma(x_i, y_i)$, 
$\delta_Y(x_i, y_i)$, and $\delta_X(x_i, y_i)$ are determined as follows: 

\[
\begin{aligned}
\sigma(x_i, y_i)   &= \begin{cases} 1 & \text{if } x_i = 1 \wedge y_i = 1 \\ 0 & \text{otherwise} \end{cases} \\
\delta_Y(x_i, y_i) &= \begin{cases} 1 & \text{if } x_i = 0 \wedge y_i = 1 \\ 0 & \text{otherwise} \end{cases} \\
\delta_X(x_i, y_i) &= \begin{cases} 1 & \text{if } x_i = 1 \wedge y_i = 0 \\ 0 & \text{otherwise} \end{cases}
\end{aligned}
\tag{2}
\]

Consequently, we rate the second sum in (1) higher for a larger number of occurrences 
of feature $i$ for class $C_Y$, whereas we rate the third sum lower. In other words, 
if the training case $Y$ contains a certain feature but the new case $X$ does not, 
then we rate the difference the stronger the more often the feature occurs for 
class $C_Y$. For features occurring in the case $X$ but not in $Y$ we apply the opposite 
principle. 

Finally, $w_i$ represents the weight of feature $i$. We calculate its value by 
introducing the following weighting function: 

\[
w_i = \frac{1}{c} \cdot \sum_{j=1}^{c} \bigl( 1 - 4 \cdot p(D_i, j) \cdot [1 - p(D_i, j)] \bigr)
\tag{3}
\]

The term under the summation symbol represents the selectivity of feature $i$ for 
class $j$. It equals 1 if either all or none of the instances have value 1 for this 
feature. In this case, all instances for class $j$ either possess or do not possess 
this feature, which makes it a very discriminating characteristic. The opposite 
extreme is that $p(D_i, j)$ equals 50%, because then the feature possesses no 
information for the prediction of the class and the term under the summation symbol 
becomes 0. 
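
Formulas (1) to (3) can be traced with a short Python sketch. It is an illustrative reading of the formulas, not the ROCK & ROLL implementation: the function names, the representation of cases as tuples of 0/1 values and the toy training set are assumptions.

```python
def proportion(train, i, cls):
    """p(D_i, cls): share of the training cases of class cls with feature i = 1."""
    members = [x for x, c in train if c == cls]
    return sum(x[i] for x in members) / len(members)


def weights(train, n):
    """Feature weights w_i according to (3), averaged over all classes."""
    classes = sorted({c for _, c in train})
    result = []
    for i in range(n):
        selectivity = [1 - 4 * proportion(train, i, j) * (1 - proportion(train, i, j))
                       for j in classes]
        result.append(sum(selectivity) / len(classes))
    return result


def sim(x, y, cy, train, w):
    """SIM(X, Y) according to (1), with the case distinction of (2)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        p = proportion(train, i, cy)
        if xi == 1 and yi == 1:      # sigma: feature shared by X and Y
            total += p * w[i]
        elif xi == 0 and yi == 1:    # delta_Y: feature only in Y
            total -= p * w[i]
        elif xi == 1 and yi == 0:    # delta_X: feature only in X
            total -= (1 - p) * w[i]
    return total


def classify(x, train):
    """Predict the class of the most similar training case."""
    w = weights(train, len(x))
    best = max(train, key=lambda case: sim(x, case[0], case[1], train, w))
    return best[1]


# Toy training set (assumption): three binary features, two classes.
train = [((1, 1, 0), "a"), ((1, 0, 0), "a"), ((0, 0, 1), "b"), ((0, 1, 1), "b")]
w = weights(train, 3)
```

In the toy data the second feature occurs in half of the cases of each class, so its weight is 0, while the other two features separate the classes perfectly and receive weight 1.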

2.2 Prototype-Based Learning 

We have developed the prototype-based algorithm BIN-PRO for binary features. 
For each class $C$ we compute $|D_C|$, the number of instances that belong to 
class $C$, as well as $|D_{C,f}|$. The latter restricts the set of instances $D_C$ to those 
instances that have value 1 for feature $f$. 

As similarity function between a new case X and a class C we introduce the 
following formula: 



In this formula we consider both features that are present in the new case $X$ 
and features that are important for class $C$ but missing in $X$. However, we give 
more emphasis to the former by dividing the second sum by $|D_C|$. As weighting 
function $w_f$ we use (3) again. 

To further improve the performance of BIN-PRO in comparison with BIN-CAT 
(see Sect. 4), we have added an iterative adaptive component to BIN-PRO. 
The resulting algorithm BIN-PIA iteratively adapts the values $|D_{C,f}|$ to increase 
the fitness of the algorithm, i.e. the proportion of correctly classified training 
instances. 

Figure 1 shows the applied algorithm in detail. For each wrong classification 
of a training instance $X$ and all features $f \in X$, it decrements $|D_{C_{pred},f}|$ for 
the wrong predicted class $C_{pred}$ by 1 and increments $|D_{C_{corr},f}|$ for the correct 
class $C_{corr}$ by 1. This adaptation of the $|D_{C,f}|$ is repeated in several iterations 
for all training instances until either the fitness reaches 100 % or the number of 
iterations exceeds a certain limit. 
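
The adaptation loop can be sketched in Python as follows. Several points are assumptions and should be read as such: formula (4) is not fully legible in this text, so the similarity below (present features minus missing features divided by $|D_C|$) is a guess at its shape; features are represented as sets of names; the weights default to 1 instead of being computed with (3); ties are resolved in favour of the class seen first; and the loop performs at most `limit` passes.

```python
from collections import defaultdict


def similarity(x, cls, counts, class_sizes, weights):
    """Assumed form of SIM(X, C); the second term is a guess, see lead-in."""
    present = sum(counts[cls][f] * weights.get(f, 1.0) for f in x)
    missing = sum(n * weights.get(f, 1.0)
                  for f, n in counts[cls].items() if f not in x)
    return present - missing / class_sizes[cls]


def iterative_adaptation(train, weights, limit=10):
    """Adapt the counts |D_{C,f}| until all training cases are classified
    correctly or the iteration limit is reached."""
    class_sizes = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for x, c in train:
        class_sizes[c] += 1
        for f in x:
            counts[c][f] += 1

    fitness, iteration = 0.0, 0
    for iteration in range(1, limit + 1):
        correct = 0
        for x, c_corr in train:
            c_pred = max(class_sizes,
                         key=lambda c: similarity(x, c, counts, class_sizes, weights))
            if c_pred == c_corr:
                correct += 1
            else:
                # Shift evidence from the wrongly predicted to the correct class.
                for f in x:
                    counts[c_pred][f] -= 1
                    counts[c_corr][f] += 1
        fitness = correct / len(train)
        if fitness == 1.0:
            break
    return counts, fitness, iteration


# Toy data (assumption): class "B" is initially confused with class "A".
train = [({"a"}, "A"), ({"a"}, "A"), ({"a", "b"}, "B")]
counts, fitness, iterations = iterative_adaptation(train, weights={})
```

On this toy data the first pass misclassifies the third case, the counts are shifted once, and the second pass classifies all cases correctly.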

3 Learning Task 

The learning task in natural language database interfaces is to select the 
correct command class based on semantic features extracted from the user input. 
Therefore, it can be modeled as a classification problem with a large number of 
binary features and classes. For that purpose we have developed the interface 
architecture displayed in Fig. 2. The morpho-lexical analyzer transforms the user 
input into a deep form list (DFL), which indicates for each word token its surface 
form, category, and semantic deep form (see [12] for more details). 

For database interfaces unknown values contained in the input possess par- 
ticular importance for the meaning of a command [13]. Therefore, we treat the 
unknown values separately in the unknown value list (UVL) analyzer. This mod- 
ule checks the data type of unknown values and looks them up in the database 
to find out whether they represent identifiers of existing entities. In such a case 



\[
SIM(X, C) = \sum_{f \in X} |D_{C,f}| \cdot w_f \; \ldots
\tag{4}
\]
Program Iterative-Adaptation 
begin 
    repeat 
        set number of correctly classified instances |D_corr| to 0; 
        foreach instance X in training collection D do 
        begin 
            calculate predicted class C_pred with maximum similarity according to (4); 
            if C_pred = correct class C_corr then 
                increment |D_corr|; 
            else 
                foreach feature f do 
                    if f ∈ X then 
                    begin 
                        decrement |D_{C_pred,f}|; 
                        increment |D_{C_corr,f}|; 
                    end 
        end 
        assign |D_corr| / |D| to fitness; 
    until fitness = 1.0 or number of iterations > limit; 
end 



Fig. 1. Algorithm for iterative adaptation 



the entity type is indicated in the resulting UVL, otherwise we use the data 
type instead. UVL and DFL represent the input to the machine learning (ML) 
classifier. It assigns a ranked command class list (CCL) to the input sentence 
according to the learned classification knowledge. As a last step we use the CCL 
and UVL to generate database commands. 

For the encoding of the training data we only make use of the semantic deep 
forms contained in the DFL. We use English concepts as deep forms and map 
them to binary features, i.e. a certain feature has the value 1 if the deep form 
is a member of the DFL, otherwise it equals 0. For the elements of the UVL we 
apply a more detailed encoding, which maps the number and the type to binary 
features. Figure 3 shows an example of the feature encoding for a German input 
sentence. Besides a German morpho-lexical analyzer, we also developed modules 
for the processing of English and Japanese input [14]. All components of the 
interface architecture are implemented in ROCK & ROLL by taking advantage 
of the available powerful deductive object-oriented programming language. 
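
The encoding can be illustrated with a few lines of Python. The feature index, the naming of the UVL features as "number type" strings and the function `encode` are assumptions for illustration; the example input follows Figure 3.

```python
from collections import Counter


def encode(dfl_deep_forms, uvl_types, feature_index):
    """Map the deep forms of a DFL and the typed entries of a UVL to a
    binary feature vector over a fixed feature index."""
    present = set(dfl_deep_forms)
    # For the UVL, number and type together form one feature, e.g. "1 material".
    present |= {f"{count} {kind}" for kind, count in Counter(uvl_types).items()}
    return [1 if name in present else 0 for name in feature_index]


# Hypothetical feature index; the real system uses 290 DFL and 27 UVL features.
FEATURES = ["cost", "now", "schilling", "show",
            "1 material", "2 material", "1 real"]

# "St 37 H kostet nun 1.7 Schilling": one material identifier, one real value.
vector = encode(["cost", "now", "schilling"], ["material", "real"], FEATURES)
```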

4 Evaluation 

As a case study for the evaluation of the lazy learning algorithms we use them 
within a German natural language interface to a production planning and control 
system (PPC). The task of the PPC is the medium-term scheduling of products and 

Fig. 2. Interface architecture (from user input to database command) 



resources involved in the manufacturing processes, i.e. material, machines, and 
labor. The output of the PPC is the master production schedule, which is needed 
for the coordination of related business services, e.g. engineering, manufacturing, 
and finance. The modeled enterprise produces precision tools by applying job 
order production and serial manufacture as basic strategies. 

The efficient realization of the high demands of this application exceeds by 
far the available power of relational database technology. Therefore, the PPC 
represents an excellent choice for taking full advantage of the extended 
functionality of deductive object-oriented database systems. Furthermore, the complex 
requirements justify the effective use of a natural language front-end. 

During previous research [11] we developed a natural language interface based 
on 1000 input sentences, which had been collected from users by means of ques- 
tionnaires. The input sentences were then mapped to 100 command classes (10 
sentences for each class). The mapping was performed by elaborate semantic 
analysis; for the development of the underlying rule base we spent several man- 
months. 

Therefore, we were eager to see if we could replace this extensive effort by a 
machine learning module, which learns the same linguistic knowledge 
automatically. As a result of the encoding of the complete data collection of 1000 sentences 
we identified as many as 317 features, 290 for the DFL and 27 for the 
UVL. For the evaluation of the different machine learning algorithms we used 

User input:  St 37 H kostet nun 1.7 Schilling 
             (St 37 H costs now 1.7 Schilling) 
DFL:         cost, now, schilling 
UVL:         1 material, 1 real 

Fig. 3. Example of feature encoding 



the success rate and the top-3 rate as performance measures. The success rate 
indicates the proportion of correctly classified test cases whereas the top-3 rate 
shows the proportion of cases for which the correct classification is among the 
first three predicted classes. 

We performed 10-fold cross-validation to test the machine learning algo- 
rithms’ predictive accuracy. This means that we randomly divided the data 
collection into 10 equal blocks. Each block in turn was used as test set whereas 
the remaining blocks formed the training set. Table 1 shows the achieved mean 
and standard deviation of the success rate and top-3 rate for the 5 different 
learning algorithms. 
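
The evaluation protocol can be sketched in Python as follows. The function names, the fixed random seed and the interface of the classifier (a caller-supplied `train_and_rank` function that returns a ranked list of classes) are assumptions; the sketch only shows how the blocks and the two performance measures are formed.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Randomly split range(n) into k blocks of (nearly) equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def top_k_rate(rankings, truths, k):
    """Share of cases whose correct class is among the first k predictions.
    With k = 1 this is the success rate, with k = 3 the top-3 rate."""
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth in ranked[:k])
    return hits / len(truths)


def cross_validate(cases, labels, train_and_rank, k=10, seed=0):
    """Return one (success rate, top-3 rate) pair per test block."""
    results = []
    for block in k_fold_indices(len(cases), k, seed):
        held_out = set(block)
        train = [(cases[i], labels[i])
                 for i in range(len(cases)) if i not in held_out]
        rankings = [train_and_rank(train, cases[i]) for i in block]
        truths = [labels[i] for i in block]
        results.append((top_k_rate(rankings, truths, 1),
                        top_k_rate(rankings, truths, 3)))
    return results


folds = k_fold_indices(1000)
```

The mean and standard deviation reported per algorithm would then be taken over the ten pairs returned by `cross_validate`.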



Table 1. Test results 

            SUCCESS RATE         TOP-3 RATE 
            MEAN      STDEV      MEAN      STDEV 
IB1         81.3 %    2.50 %     94.5 %    1.65 % 
IB1-IG      89.0 %    3.71 %     98.4 %    1.17 % 
BIN-CAT     95.2 %    1.62 %     99.9 %    0.32 % 
BIN-PRO     92.5 %    3.17 %     99.6 %    0.70 % 
BIN-PIA     94.7 %    3.40 %     99.7 %    0.67 % 




The test for the statistical significance of the performance differences between 
the different algorithms resulted in the significance matrix displayed in Table 2 
(for a significance level of 5%). For each cell of the matrix it shows the 
significance of the performance difference between the algorithm written as column 
label $A_c$ and the algorithm written as row label $A_r$: 

1. +: $A_c$ is significantly better than $A_r$, 

2. -: $A_c$ is significantly worse than $A_r$, 

3. ~: there is no significant difference between $A_c$ and $A_r$. 

Each entry shows first the result for the success rate and then for the top-3 
rate, e.g. the entry +/~ between BIN-CAT and BIN-PRO means that BIN-CAT 

Table 2. Significance matrix 

            IB1     IB1-IG   BIN-CAT   BIN-PRO   BIN-PIA 
IB1                 +/+      +/+       +/+       +/+ 
IB1-IG      -/-              +/+       +/+       +/+ 
BIN-CAT     -/-     -/-                -/~       ~/~ 
BIN-PRO     -/-     -/-      +/~                 ~/~ 
BIN-PIA     -/-     -/-      ~/~       ~/~ 


did significantly better than BIN-PRO regarding the success rate but that there 
was no significant difference with respect to the top-3 rate. 

Figure 4 gives a clearer picture by ranking the algorithms according to the 
achieved success and top-3 rates. In this figure the brackets represent a significant 
difference between two algorithms. In other words, the algorithms inside a
bracket show no significant difference, e.g. the first bracket on the left shows
that BIN-CAT did significantly better than BIN-PRO but that there were no
significant differences between BIN-CAT and BIN-PIA or between BIN-PIA and
BIN-PRO.



[Figure: two panels, "Success rate" and "Top-3 rate"; each ranks the algorithms
in the order BIN-CAT, BIN-PIA, BIN-PRO, IB1-IG, IB1, with brackets marking the
significance groups.]

Fig. 4. Ranking of algorithms



If we compare the results for the different algorithms, we can see that the
three algorithms of the BIN group clearly outperform IB1 and IB1-IG. BIN-CAT
is significantly better than BIN-PRO only with respect to the success rate,
whereas BIN-PIA shows no significant inferiority on either performance measure.
This outstanding performance of both BIN-PIA and BIN-PRO is remarkable
if one considers the much more condensed representation of the learned
knowledge by the use of prototypes.

To estimate the computational overhead of BIN-PIA in comparison with 
BIN-PRO we monitored the number of required iterations, which was only 3.5 
on average. Figure 5 plots the average fitness of BIN-PIA after each iteration; it 
shows the excellent convergence of the algorithm. 
[Figure: average fitness of BIN-PIA plotted against the number of iterations.]

Fig. 5. Fitness function of BIN-PIA algorithm



In the last part of our evaluation we generated learning curves, which show
the decrease of the success rate (Fig. 6) and the top-3 rate (Fig. 7) as fewer
blocks are used during training. Table 3 indicates the number of blocks for
which we observed a significant reduction of the performance measures. These
results suggest that a smaller number of training examples would be sufficient
for this type of application.

Finally, we also analyzed the standard deviation of the success rate (Fig. 8)
and top-3 rate (Fig. 9) as a function of the number of blocks used during
training, which shows the expected general trend of the standard deviation to
increase for a smaller number of blocks.

5 Conclusion 

Our empirical results were surprisingly good if one considers the complexity of 
the task, which included many similar classes that were very difficult to distin- 
guish even for human experts. In any case, we could show that lazy learning 
algorithms represent a sound alternative to manual knowledge acquisition for 
the application in natural language database interfaces. 

The results also show that the prototype-based algorithms are competitive
with instance-based learning. By using the technique of iterative adaptation we
could observe results that showed no significant inferiority to the best
instance-based algorithm, BIN-CAT. This behavior of BIN-PIA is remarkable if one
considers the large reduction of required memory and processing time for
classification by the use of prototypes.

Future work will concentrate on the important point of testing our learning
algorithms on standard benchmark machine learning datasets and other typical
datasets for machine learning of natural language. This will provide more
experimental evidence to show that the presented methods are generally valid also
in other learning contexts with many binary features and classes.



Table 3. Limits for significant reduction

           SUCCESS RATE   TOP-3 RATE
IB1        6              6
IB1-IG     6              7
BIN-CAT    5              5
BIN-PRO    4              3
BIN-PIA    5              3


Fig. 8. Standard deviation of success rate

Fig. 9. Standard deviation of top-3 rate

Abstract. In this work a measure called GD is presented for attribute
selection. This measure is defined between an attribute set and a class
and corresponds to a generalization of the Mantaras distance that allows
the interdependencies between attributes to be detected. In the same way,
the proposed measure allows the attributes to be ordered by importance in
the definition of the concept. This measure does not exhibit a noticeable
bias in favor of attributes with many values. The quality of the attributes
selected using the GD measure is tested by means of different comparisons
with two other attribute selection methods over 19 datasets.

Keywords: Machine learning, Intelligent information retrieval, Feature
selection



1 Introduction 

In many Machine Learning problems, the induction algorithms have to deal with
attributes that are not relevant to the definition of the class. Irrelevant or
redundant attributes do not affect the ideal Bayesian classifier because the
addition of new attributes never decreases the performance of that classifier.
However, the performance of many practical classifiers decreases when irrelevant
or redundant attributes are present. To overcome this problem, different
approaches have been proposed to select the most relevant attributes that define
a class. Some works on attribute selection are the WINNOW algorithm proposed by
Littlestone [15], the FOCUS algorithm proposed by Almuallim and Dietterich [3]
and the Relief algorithm proposed by Kira and Rendell [11]. All these algorithms
share the characteristic that they do not include the performance of the
classifier as a measure to guide the selection of the attributes. John et al. [10]
propose the wrapper approach, which utilizes the performance of the classifier to
carry out the selection of the attributes. There is much evidence that wrapper
methods give good results [1,10]. However, due to their computational cost,
wrapper methods can only be applied in combination with classifiers of low
complexity. An intermediate approach proposed by Scherf and Brauer [21] performs
the feature selection in two steps. The first step is a filter approach whose
result is a set of different attribute subsets, and the second step is a wrapper
approach over the subsets resulting from the first step.

The method proposed in this work utilizes a measure based on Information
Theory to guide the selection of the attributes. The use of concepts of
Information Theory in feature selection is not new. Quinlan [20] proposed a
measure called Gain Ratio that corresponds to the ratio between mutual
information and the entropy [24]. Another approach based on Information Theory
is proposed by Lopez de Mantaras [16], where a distance measure is defined and
its relationship with the Gain Ratio measure proposed by Quinlan is analyzed.
Wettschereck and Dietterich [23] demonstrated that the performance of the k-NN
and Nearest-Hyperrectangle classifiers increases when the attributes are weighted
by mutual information. Daelemans [7] reaches similar conclusions when the
features used in the Exemplar-based Generalization algorithm are weighted with
the mutual information in a problem of assignment of syllable boundaries in Dutch.

In the previous works the attributes are considered independent; the possible
relations between them are not taken into account. In this work a measure called
GD, between an attribute subset and the class, is proposed. This measure,
unlike the Gain Ratio, tries to capture the possible interdependence among
attributes and is based on a quadratic form of the distance proposed by Lopez de
Mantaras and a matrix called the Transinformation Matrix.

The organization of the rest of this paper is as follows. In section 2 some
concepts of Information Theory are reviewed. Then, in section 3, the distance
proposed by Lopez de Mantaras is analyzed. The GD measure is defined in
section 4, and a comparison with two other attribute selection methods on several
datasets is presented in section 5. The selection of the attributes is carried out
from a set of labeled samples. Each sample of the data set is composed of an
n-dimensional vector of attributes $X = \{X_1, X_2, \ldots, X_n\}$ and a label $Y$
which indicates the class the sample belongs to.

2 Review of Some Concepts on Information Theory 

As the measure we propose in this work is based on Information Theory, before 
introducing the measure itself, a review of some previous concepts of the Infor- 
mation Theory is included. The different concepts will refer to the attributes 
and the class because they can be considered as random variables and so the 
concepts can be defined on them. 

Let $H(X_i)$ be the entropy of attribute $X_i$ with values
$\{x_i^1, \ldots, x_i^k\}$, defined as:

$$H(X_i) = -\sum_{l=1}^{k} P(x_i^l) \log P(x_i^l) \qquad (1)$$

where $P(x_i^l)$ is the probability that value $x_i^l$ occurs. According to this
expression, the entropy measures the average uncertainty of the attribute and it
is non-negative [6]. In the same way that we have defined the entropy of an
attribute, we can define the entropy of the class $Y$.

When one attribute is known, the amount of uncertainty of the class is
decreased. A measure that reveals the information given by an attribute $X_i$
about a class $Y$ is the mutual information, $I(X_i; Y)$. The expression of the
mutual information is:

$$I(X_i; Y) = H(Y) - H(Y/X_i) \qquad (2)$$

In equation (2), $H(Y/X_i)$ represents the entropy of $Y$ when $X_i$ is known.
This entropy is called conditional entropy; it is non-negative and less than or
equal to $H(Y)$ [6], so the mutual information is greater than or equal to zero.
The mutual information is also symmetric.

From the joint entropy of an attribute and the class, $H(X_i, Y)$, and from the
mutual information $I(X_i; Y)$, the entropy distance [18] is defined as:

$$d(X_i, Y) = H(X_i, Y) - I(X_i; Y) \qquad (3)$$

The entropy distance measures the information that the attribute $X_i$ gives
about a class: the more information the attribute gives, the greater the
mutual information is and therefore the smaller the distance is.

3 Mantaras Distance

A measure that is conceptually very close to the measure proposed in this work
is the Mantaras distance proposed by Lopez de Mantaras [16], so we give a
short description of it before defining the GD measure.

The Mantaras distance is a distance measure between two partitions, used to
select the attributes associated with the nodes of a decision tree. In each node,
the attribute chosen is the one that produces the partition closest to the
correct partition of the subset of examples at that node.

The Mantaras distance has the following expression:

$$d_{LM}(P_A, P_B) = H(P_A/P_B) + H(P_B/P_A) \qquad (4)$$

where $H(P_A/P_B)$ and $H(P_B/P_A)$ correspond to the entropy of each partition
when the other one is known. It is possible to demonstrate the following
properties [16]:

1. $d_{LM}(P_A, P_B) \ge 0$, with equality iff $P_A = P_B$

2. $d_{LM}(P_A, P_B) = d_{LM}(P_B, P_A)$

3. $d_{LM}(P_A, P_B) + d_{LM}(P_B, P_C) \ge d_{LM}(P_A, P_C)$
If we replace the references to partitions by attributes and class in the
definition of the Mantaras distance, we get the following expression for the
Mantaras distance

$$d_{LM}(X_i, Y) = H(X_i/Y) + H(Y/X_i) \qquad (5)$$

and using equation (2) and the relation

$$H(Y/X_i) = H(Y, X_i) - H(X_i) \qquad (6)$$

equation (5) can be transformed into

$$d_{LM}(X_i, Y) = H(X_i, Y) - I(X_i; Y) = d(X_i, Y) \qquad (7)$$

which corresponds to the expression of the entropy distance and shows that the
entropy distance is a metric distance function.
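
The equivalence stated in equation (7) can be checked numerically. The sketch below estimates the entropies from sample frequencies; the base-2 logarithm, the function names and the toy data are assumptions made for illustration, since the text does not fix them.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Empirical entropy of a sequence of discrete values, as in equation (1)."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def joint_entropy(xs, ys):
    """Entropy of the paired variable (X, Y)."""
    return entropy(list(zip(xs, ys)))

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), equivalent to (2) by relation (6)."""
    return entropy(xs) + entropy(ys) - joint_entropy(xs, ys)

def entropy_distance(xs, ys):
    """Equation (3): d(X,Y) = H(X,Y) - I(X;Y)."""
    return joint_entropy(xs, ys) - mutual_information(xs, ys)

def mantaras_distance(xs, ys):
    """Equation (5): H(X/Y) + H(Y/X), with H(A/B) = H(A,B) - H(B)."""
    h_xy = joint_entropy(xs, ys)
    return (h_xy - entropy(ys)) + (h_xy - entropy(xs))

# Toy sample (hypothetical): one attribute and a binary class.
x = ["a", "a", "b", "b", "c", "c", "a", "b"]
y = [0, 0, 1, 1, 1, 0, 0, 1]
gap = abs(entropy_distance(x, y) - mantaras_distance(x, y))
```

On any sample the two functions agree up to rounding error, and the distance of a variable to itself is zero.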



4 GD Measure 



The use of the Gain Ratio and the Mantaras distance has the drawback of
operating over isolated attributes. Therefore, these methods do not detect the
possible dependencies that there could be between attributes. One way to take
the interdependencies between attributes into account is to compute the mutual
information for each pair of attributes, $I(X_i; X_j)$. These interdependencies
can be represented with the aid of the Transinformation Matrix $T$, a
square matrix of dimension $n$ (the number of attributes) where each element
$t_{i,j}$ of the matrix is the mutual information between the i-th and j-th
attributes.

Some properties hold for this matrix, whose demonstrations can be found in [17]:

1. $t_{i,i} \ge t_{i,j}$, $\; i, j = 1 \ldots n$ and $i \ne j$

2. $t_{i,j} \ge 0$, $\; i, j = 1 \ldots n$

3. $t_{i,j} = t_{j,i}$, $\; i, j = 1 \ldots n$

Proposition 1. Given an attribute set $\{X_1, X_2, \ldots, X_n\}$ and its
associated transinformation matrix $T$, if for any row $i$ it is established that

$$\exists\, j \ne i : t_{i,i} = t_{i,j}$$

then the attribute $X_j$ is redundant with respect to $X_i$ and it can be removed
from the set without any loss of information.

Proof. From the definition of the transinformation matrix and the expression (2)
of the mutual information:

$$t_{i,i} = I(X_i; X_i) = H(X_i) - H(X_i/X_i) = H(X_i)$$
$$t_{i,j} = I(X_i; X_j) = H(X_i) - H(X_i/X_j)$$

So, if $t_{i,i} = t_{i,j}$:

$$I(X_i; X_i) = I(X_i; X_j) \Rightarrow H(X_i) = H(X_i) - H(X_i/X_j) \Rightarrow H(X_i/X_j) = 0 \qquad (8)$$

$H(X_i/X_j) = 0$ means that the knowledge of $X_j$ decreases the uncertainty
of $X_i$ to zero, and therefore attribute $X_j$ holds the whole information about
$X_i$ and one of them can be removed without any loss of information.
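
As an illustration of Proposition 1, the following sketch builds an empirical transinformation matrix and lists the index pairs whose diagonal and off-diagonal entries coincide. The helper names, the tolerance `tol` and the toy attributes are hypothetical; probabilities are estimated from sample frequencies with base-2 logarithms.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Empirical entropy of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def transinformation_matrix(columns):
    """T[i][j] is the mutual information between attributes i and j."""
    n = len(columns)
    return [[mutual_information(columns[i], columns[j]) for j in range(n)]
            for i in range(n)]

def redundant_pairs(T, tol=1e-12):
    """Pairs (i, j), i != j, with t_ii == t_ij, i.e. H(X_i/X_j) = 0."""
    n = len(T)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and abs(T[i][i] - T[i][j]) <= tol]

# Toy attributes (hypothetical): x2 is the negation of x1, x3 is independent of x1.
x1 = [0, 0, 1, 1, 0, 1, 0, 1]
x2 = [1, 1, 0, 0, 1, 0, 1, 0]
x3 = [0, 1, 0, 1, 1, 0, 0, 1]
T = transinformation_matrix([x1, x2, x3])
pairs = redundant_pairs(T)
```

Here `x1` and `x2` determine each other, so both orderings of that pair are reported and one of the two attributes could be dropped before computing the GD measure.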

Once the transinformation matrix has been defined, it is necessary to find
an expression for the GD measure that includes the transinformation matrix
and the distance (3). This expression must be defined in such a way that
subsets of attributes with high dependencies between attributes yield lower
values than other ones without these high dependencies. A solution comes from the
analogy in meaning between the transinformation matrix and the covariance
matrix ($\Sigma$) of a multivariate random variable. This analogy can be
established because both matrices measure interrelation between variables. In the
Mahalanobis distance [9], the covariance matrix is utilized to correct the effects
of cross covariances between two components of a random variable. The expression
of the Mahalanobis distance [9] between two samples $(X, Y)$ of a random variable
is:

$$d_{Mahalanobis}(X, Y) = (X - Y)^T \Sigma^{-1} (X - Y) \qquad (9)$$

where $d_{Mahalanobis}(X, Y)$ corresponds to the (squared) Euclidean distance if
$\Sigma$ is the identity matrix.

Therefore the GD measure can be defined in a similar way to the Mahalanobis
distance, using the transinformation matrix instead of the covariance matrix and
the distance (3) instead of the Euclidean distance. The GD measure
$d_{GD}(X, Y)$ between the set of attributes $X$ and the class $Y$ is expressed
as:

$$d_{GD}(X, Y) = D(X, Y)^T \, T^{-1} \, D(X, Y) \qquad (10)$$

where $D(X, Y) = [d_{LM}(X_1, Y), \ldots, d_{LM}(X_n, Y)]^T$ is a vector whose
i-th element is the Mantaras distance (equivalent to the entropy distance)
between the attribute $X_i$ and the class, and $T$ is the transinformation matrix
of the set of attributes $X$. From equation (3) we can observe that the elements
of the $D(X, Y)$ vector are smaller as the information that the attribute gives
about the class is greater.
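
A minimal sketch of equation (10) follows, using only the standard library. Instead of inverting $T$ explicitly it solves the linear system $T z = D$ and returns $D^T z$; that choice, the function names, the singularity threshold and the toy data are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import log2

def entropy(values):
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def entropy_distance(xs, ys):
    # Equation (3): d(X,Y) = H(X,Y) - I(X;Y)
    return entropy(list(zip(xs, ys))) - mutual_information(xs, ys)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("transinformation matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def gd_measure(columns, labels):
    """Equation (10): D^T T^{-1} D for a list of attribute columns and a class."""
    D = [entropy_distance(col, labels) for col in columns]
    T = [[mutual_information(a, b) for b in columns] for a in columns]
    return sum(d * z for d, z in zip(D, solve(T, D)))

# Toy data (hypothetical): the class equals x1; x3 is independent of x1.
x1 = [0, 0, 1, 1, 0, 1, 0, 1]
x3 = [0, 1, 0, 1, 1, 0, 0, 1]
y = list(x1)
gd_informative = gd_measure([x1], y)
gd_irrelevant = gd_measure([x3], y)
gd_both = gd_measure([x1, x3], y)
```

In this toy case the informative attribute alone gives a GD value of 0, while the irrelevant attribute gives 4, so a search that minimizes the measure would prefer `x1`.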

Given a set of attributes and the associated transinformation matrix, the GD
measure fulfills the following properties:

1. $d_{GD}(X, Y) \ge 0, \; \forall X, Y$ and $d_{GD}(X, X) = 0$

2. $d_{GD}(X, Y) = d_{GD}(Y, X), \; \forall X, Y$

The demonstration of the two previous properties is trivial if we take into
account the properties of the Mantaras distance and the properties of the
transinformation matrix. The triangle inequality property has not been
demonstrated for the GD measure yet, and so it can be considered as a semi-metric
distance function [4].

Table 1. Results of experiments to detect the influence of the number of values
of an attribute

Data Set   Values   GR               d_LM             d_GD
1          m=2      0.0012           0.0006           3.9903
1          m=5      0.0021 (75%)     0.0015 (150%)    4.7216 (18%)
1          m=10     0.0032 (166%)    0.0025 (316%)    5.5559 (39%)
2          m=2      0.0012           0.0007           2.9563
2          m=5      0.0021 (75%)     0.0016 (128%)    3.9559 (33%)
2          m=10     0.0033 (175%)    0.0027 (285%)    4.8589 (64%)
3          m=2      0.0049           0.0015           10.9777
3          m=5      0.0084 (71%)     0.0042 (180%)    9.1349 (-16%)
3          m=10     0.0133 (171%)    0.0079 (426%)    9.2887 (-15%)

The GD measure satisfies the monotonicity property, which states that the
distance increases with dimensionality. Therefore, only subsets with the same
cardinality can be compared with each other.

After the redundant attributes have been filtered according to Proposition 1,
the use of the GD measure for feature selection is based on the fact that the
distances $d(X_i, Y)$ decrease as the information of an attribute subset about
the class increases. On the other hand, if an element of the transinformation
matrix is large (which indicates that the interdependence between two attributes
is high) then the GD measure increases. Therefore it can be concluded that lower
values of the GD measure between an attribute subset and the class indicate that
the attributes give a lot of information about the class and that there are no
high interdependencies between the attributes.

An important aspect of the GD measure is the singularity of the transinformation
matrix. It has been analytically demonstrated in [17] that the transinformation
matrix is non-singular for dimensions two and three. An analytical demonstration
has not been found for greater dimensions yet, but all the matrices generated
in the examples of section 5 were found to be non-singular.

The GD measure does not exhibit a noticeable bias in favor of attributes with
large numbers of values as the Gain Ratio and the Mantaras distance do [24]. To
test this statement, we performed the experiments presented in [24] by White
and Liu. These experiments consist of three synthetic data sets, each one with
three attributes with 2, 5 and 10 values respectively. The attributes have no
relation with the classes, which are distributed in each data set as follows: two
equiprobable classes, two classes with an odds ratio of 4:1 and five equiprobable
classes.

Table 1 shows the obtained values of the Gain Ratio ($GR$), the Mantaras
distance ($d_{LM}$) and the GD measure ($d_{GD}$). In the table the relative
increment with respect to the attribute with the fewest values (m=2) appears in
brackets. The relative increment of the GD measure is low (between -16% and
64%), unlike those of the Gain Ratio and the Mantaras distance, which range
from 71% to 426%.
Table 2. Results obtained with the Naive Bayes classifier

      d_GD                  ReliefF                       d_LM
      # attr.  Acc.         # attr.  Acc.         Concl.  # attr.  Acc.         Concl.
BW    10       96.30±0.22   7        96.74±0.23   <       10       96.30±0.22   =
GL    8        48.08±1.04   8        48.08±1.04   =       3        49.18±1.10   =
G2    4        69.92±1.10   4        65.11±1.10   >       2        61.67±1.10   >
HD    8        85.04±0.62   11       85.45±0.65   =       12       85.15±0.62   =
IR    2        96.00±0.49   2        96.00±0.49   =       2        96.00±0.49   =
LD    6        55.58±0.82   3        59.45±0.91   <       6        55.58±0.82   =
PI    6        75.67±0.46   2        76.60±0.50   <       4        75.70±0.47   =
WI    6        97.53±0.34   13       97.43±0.34           12       97.59±0.35
PO    1        70.49±1.68   1        71.18±1.64   <       1        69.47±1.67   >
TT    6        72.12±0.43   6        73.00±0.43   <       6        73.02±0.44   <
VO    1        95.63±0.28   1        95.63±0.28           1        95.63±0.28   =
LE    7        74.99±0.44   7        74.99±0.44   =       7        74.99±0.44   =
P5    1        45.13±0.24   1        44.84±0.24   =       1        45.13±0.24
BC    4        74.09±0.80   8        74.13±0.73   =       7        74.05±0.79   =
CR    1        86.37±0.37   1        86.37±0.37   =       1        86.37±0.37   =
M1    1        75.00±0.60   3        75.00±0.60   =       3        75.00±0.60   =
M2    1        67.14±0.74   1        67.14±0.74   =       1        67.14±0.74   =
M3    2        97.22±0.23   2        97.22±0.23   =       4        97.22±0.23   =
ZO    7        93.65±0.74   8        93.75±0.73   =       8        93.75±0.73   =

5 Experiments 

In this section we compare the quality of the attributes selected by the GD
measure with the attributes selected by two other methods. The two other methods
chosen in this comparative study are the Mantaras distance and the ReliefF
method. On the one hand, the Mantaras distance has been chosen because it
has a conceptual resemblance to the proposed method. On the other hand, the
ReliefF method has been chosen because it has been widely referenced in the
bibliography [5,22,12]. The ReliefF method is a version of the Relief method due
to Kononenko [14] that permits attributes with missing values and multiclass
problems. The quality of each selected attribute set was tested by means of the
accuracy that three classifiers yield. As we are interested in comparing the
selection methods, we do not make any optimization in the classifiers, to avoid
the introduction of a bias in the accuracy due to the optimizations. The
classifiers were: the Naive Bayes classifier [9], a decision tree induced with the
ID3 method [20] and the IB1 algorithm [2]. The implementation of the induction
algorithms was done using the library [13], and the comparison was performed with
19 databases of the UCI Machine Learning Databases Repository [19]. The databases
used in the experiments were the following ones: Breast Cancer Ljubljana (BC),
Breast Cancer Wisconsin (BW), Credit Card (CR), Glass (GL), Glass2 (G2), Heart
Disease (HD), Iris (IR), Led (LE), Liver Disorder (LD), Monk1, Monk2, Monk3
(M1, M2, M3), Parity5+5 (P5), Pima Indian Diabetes (PI), Post-operative (PO),
Tic-Tac-Toe (TT), Voting (VO), Wine (WI), Zoo (ZO).
Table 3. Results obtained with the ID3 classifier

      d_GD                  ReliefF                       d_LM
      # attr.  Acc.         # attr.  Acc.         Concl.  # attr.  Acc.         Concl.
BW    2        95.32±0.23   3        95.46±0.23   =       10       94.72±0.29   >
GL    6        69.72±0.96   6        69.72±0.96   =       8        69.27±0.98   =
G2    5        82.56±0.96   3        83.42±0.95   =       6        88.20±0.81   <
HD    4        79.07±0.71   5        81.41±0.72   <       8        78.96±0.75   =
IR    2        95.93±0.49   2        95.93±0.49   =       2        95.93±0.49   =
LD    4        64.37±0.78   4        64.37±0.78   =       5        63.51±0.75   =
PI    8        70.65±0.49   8        70.65±0.49   =       8        70.65±0.49   =
WI    3        94.61±0.49   5        95.28±0.53           3        94.61±0.49   =
PO    4        70.56±1.62   1        71.18±1.64   =       1        69.47±1.67   =
TT    9        85.46±0.33   9        85.46±0.33   =       9        85.46±0.33   =
VO    1        95.63±0.28   1        95.63±0.28   =       1        95.63±0.28
LE    7        73.44±0.42   7        73.44±0.42   =       7        73.44±0.42   =
P5    8        99.96±0.04   5        100.0±0.00   =       8        99.96±0.04
BC    4        75.77±0.70   1        72.50±0.80   >       4        75.05±0.77   =
CR    1        86.37±0.37   1        86.37±0.37   =       1        86.37±0.37   =
M1    5        99.93±0.07   3        100.0±0.00   =       5        99.93±0.07   =
M2    5        77.80±0.69   5        77.80±0.69   =       5        77.80±0.69   =
M3    5        100.0±0.00   6        100.0±0.00   =       5        100.0±0.00   =
ZO    9        96.42±0.59   8        96.03±0.60   =       8        96.03±0.60   =

As the GD measure is not defined for attributes with missing values, the
databases were chosen with few missing values. In all of them, fewer than 10%
of the samples have missing values, and these samples were removed from
the dataset. The continuous attributes were discretized with the simple
equal-width discretization method with 10 intervals. The process followed to test
the quality of the attributes selected with the GD measure was the following. For
all datasets, we selected the best attribute subset according to each of the
three methods being compared. Then we estimated the accuracy yielded by each
classifier using the selected attributes. The accuracy was estimated by taking
the mean of ten runs of a 10-fold cross validation [8].
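
Equal-width discretization with 10 intervals can be sketched as below. The function name and the handling of the maximum value and of constant attributes are assumptions for the example; the text only names the method and the number of intervals.

```python
def equal_width_bins(values, n_bins=10):
    """Map each continuous value to the index (0..n_bins-1) of its interval.

    The range [min, max] is split into n_bins intervals of equal width;
    the maximum value is placed in the last interval.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute falls into a single interval.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Toy attribute (hypothetical values).
bins = equal_width_bins([0.0, 0.5, 1.0, 2.5, 9.99, 10.0])
```

The resulting interval indices can be treated as the values of a nominal attribute when estimating the entropies used by the GD measure.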

To search for the subset with the minimum value of the GD measure, a Sequential
Forward Search (SFS) was implemented, adding in each step the attribute that gave
the lowest increase of the measure value. For the ReliefF algorithm we sorted the
attributes in decreasing order of relevance and took in each case the number of
attributes we were considering. With the Mantaras distance we did the same
but sorting the attributes in increasing value of the distance. The best results
obtained for each classifier are shown in Tables 2, 3 and 4.
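
The search strategy described above can be sketched as a generic greedy loop. The `measure` callback stands in for the GD measure of a candidate subset; the function name, the returned search path and the toy cost function are hypothetical and only illustrate the control flow.

```python
def sequential_forward_search(n_attributes, measure, max_size=None):
    """Greedy SFS: at each step add the attribute giving the lowest measure value.

    `measure` takes a list of attribute indices and returns a number.
    Returns one (subset, value) pair per subset size, in order of growth.
    """
    if max_size is None:
        max_size = n_attributes
    selected, remaining, path = [], list(range(n_attributes)), []
    while remaining and len(selected) < max_size:
        # Adding the attribute with the lowest resulting value is the same as
        # choosing the lowest increase, since the current subset is fixed.
        best = min(remaining, key=lambda a: measure(selected + [a]))
        selected = selected + [best]
        remaining.remove(best)
        path.append((list(selected), measure(selected)))
    return path

# Toy measure (hypothetical): per-attribute cost plus a penalty when the
# interdependent attributes 1 and 2 appear together.
COST = [3.0, 1.0, 2.0]

def toy_measure(subset):
    penalty = 5.0 if 1 in subset and 2 in subset else 0.0
    return sum(COST[a] for a in subset) + penalty

path = sequential_forward_search(3, toy_measure)
```

Because the measure grows with dimensionality, the values in `path` are only comparable between subsets of the same size; the subset size itself would be chosen by the accuracy of the classifier, as in the experiments.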

Table 4. Results obtained with the IB1 classifier

      d_GD                  ReliefF                       d_LM
      # attr.  Acc.         # attr.  Acc.         Concl.  # attr.  Acc.         Concl.
BW    9        95.81±0.23   5        96.38±0.21   <       7        95.90±0.23   =
GL    6        76.53±0.89   6        76.53±0.89   =       9        68.59±0.99   >
G2    5        87.85±0.73   5        87.85±0.73   =       4        86.68±0.83   =
HD    11       80.78±0.69   5        79.30±0.73   =       8        76.52±0.71   >
IR    4        95.73±0.50   4        95.73±0.50   =       4        95.73±0.50   =
LD    5        66.01±0.81   5        66.01±0.81   =       5        66.01±0.81   =
PI    8        70.63±0.44   8        70.63±0.44   =       8        70.63±0.44   =
WI    11       97.54±0.37   11       96.69±0.41   >       9        98.26±0.32   <
PO    1        50.08±2.37   2        56.52±1.69   <       1        54.76±2.74   <
TT    9        80.77±0.38   8        81.98±0.37   <       8        81.94±0.37   <
VO    14       93.98±0.33   10       93.77±0.33           10       93.84±0.39
LE    7        64.81±0.66   7        64.81±0.66   =       7        64.81±0.66
P5    8        99.92±0.05   5        100.0±0.00   =       8        99.92±0.05
BC    6        73.53±0.73   8        74.03±0.72   =       7        74.08±0.75   =
CR    11       83.49±0.43   9        82.92±0.40   =       1        82.31±1.23   =
M1    5        99.79±0.12   3        100.0±0.00   <       5        99.79±0.12   =
M2    5        67.01±0.70   5        67.01±0.70   =       5        67.01±0.70   =
M3    5        99.66±0.15   2        96.94±0.28   >       5        99.66±0.15   =
ZO    11       97.71±0.44   10       97.03±0.48   =       13       97.51±0.48   =

To assess the obtained results, two paired t statistical tests with a confidence
level of 90% were carried out. Under the null hypothesis of the first statistical
test, the two methods have the same accuracy, which means that
$accuracy_{d_{GD}} = accuracy_{ReliefF}$ or $accuracy_{d_{GD}} = accuracy_{d_{LM}}$.
If the null hypothesis of this statistical test is rejected, another statistical
test is performed in which the null hypothesis is that the accuracy of the
proposed method is lower than or equal to that of the other method, which means
that $accuracy_{d_{GD}} \le accuracy_{ReliefF}$ or
$accuracy_{d_{GD}} \le accuracy_{d_{LM}}$. The results of these statistical tests
appear in Tables 2, 3 and 4 under the column labeled "Concl.".
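
The two-stage test can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure in detail: it assumes exactly 10 paired accuracy estimates per method (9 degrees of freedom) and uses the tabulated Student t quantiles 1.833 (two-sided, 90%) and 1.383 (one-sided, 90%); the function names and the sample accuracies are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

T_TWO_SIDED_90 = 1.833  # 0.95 quantile of Student's t, 9 degrees of freedom
T_ONE_SIDED_90 = 1.383  # 0.90 quantile of Student's t, 9 degrees of freedom

def paired_t(a, b):
    """Paired t statistic for two equally long lists of measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    m, sd = mean(diffs), stdev(diffs)
    if sd == 0:
        # All differences are identical: no evidence if they are zero,
        # otherwise the statistic is unbounded in the direction of the mean.
        return 0.0 if m == 0 else (float("inf") if m > 0 else float("-inf"))
    return m / (sd / sqrt(len(diffs)))

def conclusion(acc_gd, acc_other):
    """Return '=', '>' or '<' in the sense of the 'Concl.' columns."""
    if len(acc_gd) != 10 or len(acc_other) != 10:
        raise ValueError("critical values assume 10 paired observations")
    t = paired_t(acc_gd, acc_other)
    if abs(t) <= T_TWO_SIDED_90:
        return "="   # first test: equal accuracy not rejected
    # Second test: reject 'GD <= other' only for a large positive statistic.
    return ">" if t > T_ONE_SIDED_90 else "<"

# Hypothetical accuracies (in %) from ten cross-validation runs.
gd = [90, 91, 92, 90, 93, 91, 92, 90, 91, 92]
other = [88, 88, 90, 89, 91, 88, 90, 89, 89, 90]
result = conclusion(gd, other)
```

With a different number of paired observations the two critical values would have to be replaced by the quantiles for the corresponding degrees of freedom.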

If we consider all the possible results that we get using the three selection
methods (GD measure, ReliefF and Mantaras distance) and the three classifiers
(Naive Bayes, ID3 and IB1) with the 19 databases, we get 114 results (the GD
measure is compared with each of the two other methods for 3 classifiers and 19
databases). We now analyze these 114 results to draw some conclusions about
the performance of the different methods.

In 9 (7.9%) of the 114 cases, the set of attributes selected by the GD measure
yields better accuracy than the method it is compared with. In 90 (78.9%) of the
114 cases, the set of attributes selected by the GD measure yields an accuracy
that is equal to that of the other method. Considering the comparison with the two
methods separately, we get that with respect to the ReliefF method in 4 (7%)
of the cases the set of attributes selected by the GD measure is better than the
one selected by ReliefF, and in 43 (75.4%) it is equal. On the other hand, with
respect to the Mantaras distance, in 5 (8.8%) of the cases the results obtained by
the GD measure improve the results of the Mantaras distance, and in 47 (82.4%)
cases the results are equal.

Taking into account the nature of the attributes of the selected databases,
the databases can be grouped in four groups: continuous attributes (BW, GL,
G2, HD, IR, LD, PI and WI databases), nominal attributes (PO, TT and VO
databases), boolean attributes (LE and P5 databases) and mixed attributes (BC,
CR, M1, M2, M3 and ZO databases). If we compare the results obtained in
each kind of database, it can be noticed that the GD measure gives better results
in databases with continuous and mixed types of attributes. However, in the
databases with nominal attributes, the GD measure has a lower performance.
Finally, the results obtained with the three methods are the same in the
databases with boolean attributes.

Table 5. Number of nodes of the tree generated by ID3

      d_GD                  ReliefF                       d_LM
      # attr.  # nodes      # attr.  # nodes      Concl.  # attr.  # nodes      Concl.
BW    2        62.16±0.34   3        84.66±0.68   <       10       49.18±0.38   >
GL    6        83.74±0.49   6        83.74±0.49   =       8        78.92±0.49   >
G2    5        39.92±0.26   3        43.74±0.34   <       6        34.60±0.38   >
HD    4        73.22±0.39   5        84.66±0.38   <       8        120.20±0.66  <
IR    2        14.66±0.15   2        14.66±0.15   =       2        14.66±0.15   =
LD    4        173.40±0.92  4        173.40±0.92  =       5        156.04±0.83  >
PI    8        231.66±0.93  8        231.66±0.93  =       8        231.66±0.93  =
WI    3        16.28±0.22   5        16.22±0.23           3        16.28±0.22
PO    4        131.15±1.64  1        4.00±0.00    >       1        4.00±0.00    >
TT    9        327.34±2.38  9        327.34±2.38  =       9        327.34±2.38  =
VO    1        4.00±0.00    1        4.00±0.00    =       1        4.00±0.00    =
LE    7        159.94±0.26  7        159.94±0.26  =       7        159.94±0.26  =
P5    8        321.30±7.07  5        63.00±0.00   >       8        321.30±7.07  =
BC    4        105.33±1.21  1        4.00±0.00    >       4        37.42±0.07   >
CR    1        3.00±0.00    1        3.00±0.00    =       1        3.00±0.00    =
M1    5        69.98±2.07   3        41.00±0.00   >       5        69.98±2.07   =
M2    5        145.90±0.34  5        145.90±0.34  =       5        145.90±0.34  =
M3    5        19.00±0.00   6        19.00±0.00   =       5        19.00±0.00   =
ZO    9        22.54±0.08   8        26.62±0.10   <       8        26.62±0.10   <

After the previous global evaluation of the results, we focus on two databases:
BW and CRX. The BW database has a completely irrelevant attribute, namely the
identifier of each sample. This attribute is the last one selected by ReliefF and
the GD measure, whereas the Mantaras distance selects it first. In the CRX
database, the attributes A4 and A5 are completely correlated, and one of them has
been removed in some distributions of this database. This correlation between
attributes A4 and A5 is detected by the GD measure, which selects the A5
attribute in last position; ReliefF and the Mantaras distance, however, do not
take into account the correlation between the attributes and select both of them
before other attributes.

In Table 3, it can be noticed that in general the attribute selection methods
do not significantly improve the performance of the induced decision tree, as
has been mentioned by other authors. Rather, the main advantage of attribute
selection methods is the reduction of the size of the induced tree, because the
presence of irrelevant or redundant attributes increases the size of the
tree. The GD measure seems to follow the same trend as the other methods
with respect to the accuracy, but the size of the tree is reduced in certain
cases. Table 5 presents the number of nodes of the induced trees along with the
results of two statistical tests similar to those used with the accuracy, but now
referring to the number of nodes. If we focus on the databases of Table 3 whose
accuracies are not statistically different and we compare the number of nodes of
the trees, we observe that in 4 (11.8%) cases the trees induced with the attribute
set obtained using the GD measure have fewer nodes than those obtained with the
attributes selected by the other two methods, and in 22 (64.7%) the number of
nodes is equal.

6 Conclusions 

In this paper, a measure for attribute selection based on Information Theory,
called the GD measure, has been presented. Unlike other measures based on
Information Theory, the GD measure not only selects the most relevant
attributes but also takes into account the interdependencies between attributes
to detect redundant attributes.

From the comparative study carried out, we can conclude that the results
obtained with the GD measure and the ReliefF method are very similar with
respect to accuracy, although the attribute sets selected with the GD measure
have a slightly lower dimensionality than those selected with the ReliefF
method.

On the other hand, the use of the GD measure for feature selection seems
to improve on the results obtained with the Mantaras distance, and with fewer
attributes. This may be due to the introduction of the transinformation matrix,
which detects the dependencies between attributes. It is important to point out
that the GD measure works well in problems where the attributes are continuous
or where there are several different types of attributes.

Finally, the GD measure does not seem to exhibit a noticeable bias in favor
of attributes with a large number of values, as other measures based on
Information Theory do. This was verified empirically following the experiments
proposed by Liu.
Abstract. It is widely reported in the literature that incremental clustering
systems suffer from instance ordering effects and that, under some orderings,
extremely poor clusterings may be obtained. In this paper we present a new
general strategy aimed at mitigating these effects, the Not-Yet strategy, which
has a general and open formulation and is not coupled to any particular system.
Unlike other proposals, this strategy maintains the incremental nature of the
learning process. In addition, we propose a classification of strategies to
avoid ordering effects which clarifies the benefits and disadvantages we can
expect from the proposal made in the paper as well as from existing ones. A
particular implementation of the Not-Yet strategy is used to conduct several
experiments. Results suggest that the strategy improves the clustering quality.
We also show that, when combined with other local strategies, the Not-Yet
strategy allows the clustering system to obtain high quality clusterings.

Keywords: Machine learning, data mining, incremental clustering, order effects.



1 Introduction 

Ideally, intelligent agents should possess the ability to adapt their behavior
to the environment over time through learning. Thus, learning methods should
be able to update a knowledge base on a continual basis as new experience is
gained. In particular, if an agent performing a clustering task [6] is to be
able to use its learned knowledge to carry out some performance task at any
stage of learning, the conceptual scheme should evolve as every new instance is
observed, without simultaneously processing previous instances. This sort of
clustering is often referred to as incremental clustering. As noted by Langley
[9], there can be several interpretations of incremental learning. In the
remainder of this paper,

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 136-147, 1998. 

© Springer-Verlag Berlin Heidelberg 1998



Robust Incremental Clustering with Bad Instance Orderings 137 



we will assume that a clustering method is incremental if it inputs one
instance at a time, does not reprocess previous instances, and maintains a
single conceptual structure in memory.

Incremental clustering, as defined above, has to rely on some sort of hill
climbing strategy which triggers small modifications of the knowledge base as
new instances are observed. This way of incorporating single instances into the
cluster structure makes incremental systems sensitive to instance order, as
widely reported in the clustering literature [2,5,7,8,9,10].

We say that incremental clustering algorithms exhibit ordering effects when
they may yield different cluster structures when the same instances are presented
in different orders. In some cases, they can even produce very poor quality
clusterings. The problem is that a hill climbing strategy may narrow the search
through the clustering space too much, so that initial observations lead to a
clustering scheme which does not reflect the real structure of the domain. In
the worst case, the system might never be able to reach a good clustering
despite gaining new experience.
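
The ordering effect can be reproduced with a very small sketch. The code below
is an illustrative assumption, not the clustering model used in this paper: a
hypothetical one-dimensional "leader" style method (`leader_cluster`, with a
made-up distance `threshold`) that sees one instance at a time, never
reprocesses earlier instances, and either creates a new cluster or incorporates
the instance into the nearest existing one.

```python
def leader_cluster(instances, threshold):
    """Incrementally cluster 1-D points in a single pass (toy example).

    Each instance joins the cluster with the nearest centroid if that
    centroid is within `threshold`; otherwise it starts a new cluster.
    Earlier decisions are never revisited.
    """
    clusters = []  # each cluster is a list of points
    for x in instances:
        best, best_dist = None, None
        for c in clusters:
            dist = abs(x - sum(c) / len(c))
            if best_dist is None or dist < best_dist:
                best, best_dist = c, dist
        if best is not None and best_dist <= threshold:
            best.append(x)        # incorporate into an existing cluster
        else:
            clusters.append([x])  # create a new cluster
    return {frozenset(c) for c in clusters}


# The same three instances, presented in two different orders.
print(leader_cluster([0, 2, 4], threshold=2))
print(leader_cluster([2, 4, 0], threshold=2))
```

With the first order the instance 2 is grouped with 0; with the second it is
grouped with 4, so the two runs end with different partitions of the same data.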

2 Avoiding Ordering Effects 

Research in incremental clustering has approached the ordering effects problem 
by using several strategies. In this section, we will give a classification of strate- 
gies to avoid ordering effects with regard to two different dimensions, namely, 
the stage in the clustering process in which they are applied and the scope. Our
aim is to clarify the potential benefits and limitations that a given strategy can
provide and also to provide a general framework in which to place our own research.

If we divide the strategies according to the stage in the clustering process
in which they are applied, we can distinguish three application points:
before, during and after clustering. Methods which are applied before clustering
can only be used when all or a large amount of the data is known beforehand.
The idea is to arrange the instance order in a way that helps the system's
search process to reach the best classification. It has been observed that when
dissimilar objects are presented consecutively, the resulting classification is
much better than when similar objects are presented together [5,7]. This occurs
because, in the former case, initial observations come from different areas of
the description space, leading initial clusters to reflect these areas, while in
the latter a skewed cluster structure may evolve. Thereafter, the clustering
system may not be able to recover when further instances from other parts of the
description space are observed. A typical example of preprocessing is seed
selection methods, which select 'seed' observations from the data and grow
clusters around them [2,11].

When constructing a cluster structure in an incremental fashion, only two
basic operators are needed, one to create a new cluster given an instance and
another to incorporate an instance into an existing cluster. Theoretically, using
these two operators, any clustering structure could be built. However, once an
object is consolidated into the structure, the clustering system cannot move it
using these two basic operators; therefore the system cannot easily recover from
[Figure 1: diagram relating the scope of a strategy (GLOBAL or LOCAL) to its
application point, with one example each: BEFORE CLUSTERING (seed selection),
DURING CLUSTERING (clustering operators), AFTER CLUSTERING (iterative
optimization).]

Fig. 1. Classification of strategies to avoid order effects and some examples



previously taken bad decisions when further experience is gained. The clustering
system may be provided with additional operators to be applied during clustering
so that it can recover from bad instance orders. These operators can
be viewed as providing some sort of backtracking capability to the system
without keeping a memory of previous knowledge structures. A classical example
of clustering operators is the merge and split operators of COBWEB [4].

Finally, we can tackle ordering effects using after clustering strategies.
These strategies are intended to act upon a previously obtained clustering in
order to improve it. Iterative optimization algorithms are well suited to this
approach. Usually, they exploit the obtained clustering to redistribute
observations or clusters and lead to an improved clustering according to some
objective function. After clustering strategies rely on a continuous
reprocessing of the clustering structure, so they violate the requirements of
incremental learning.

From another point of view, we can distinguish strategies according to the
scope of their application and effects. By a global strategy we mean a method
which uses information about the whole domain and, therefore, needs to know a
significant amount of data in advance, possibly including an extensive
reprocessing of instances. In contrast, a local strategy acts upon a small piece
of knowledge, assuming that small, local changes will contribute to improving
global clustering quality. Usually, this sort of strategy will be triggered only
by new observations. Figure 1 shows the classification of strategies discussed so
far. Clearly, global strategies are expected to give significantly better results
than local ones, since we cannot guarantee that local changes have a sufficiently
strong effect upon the global knowledge structure. However, global strategies may
be undesirable under the incremental learning assumption because they extensively
reprocess the instances in the dataset.



3 The Not-Yet Strategy 

Since our goal is to solve the instance ordering problem while maintaining the 
incremental nature of clustering systems, we propose a solution to be applied 
during the clustering process. This is a local strategy and so implies a trade-off 
between the degree of clustering quality improvement and the preservation of 
the incremental properties of a system. 
Let I be an instance
Let P be a partition
Let E be the expected utility/confidence of adding I to P
Let α be the Not-Yet threshold value
if E > α then add(P, I)
else add_NY_buffer(I)
endif



Table 1. Not-Yet control strategy



Our solution is based on a simple and intuitive idea. We refer to it as the
Not-Yet strategy, and it has a general and open definition. The strategy states
that the incorporation of an instance will be deferred in either of the
following two cases: (a) the utility of the resulting clustering is not expected
to improve after incorporating the instance, or (b) there is not enough
confidence about how the instance should be included in the existing clustering.
The Not-Yet strategy assumes the existence of a buffer which stores instances
that have not been incorporated into the clustering. In order to apply the
strategy to a particular clustering system, three issues need to be specified.
Firstly, we need some measure to decide whether an instance should be included
in the buffer. Secondly, a criterion to determine how buffered instances are
reprocessed is also needed. And finally, we can consider how to order the
instances in the buffer. Since our aim is to propose a framework general enough
to fit several approaches, we will not specify any particular solution for these
questions at this point. In the experiments, however, we will propose an example
of how the Not-Yet strategy could be effectively implemented.

In Table 1 an algorithmic formulation of the Not-Yet control strategy is
shown. We assume the existence of an add operator in the original clustering
algorithm which, given a partition and an instance, incorporates the instance
into the partition. This operator is embedded into a new conditional structure
containing the α threshold, which constrains the amount of utility or confidence
required for an instance to be incorporated into a clustering. It is worth
noting that if we assume the E value to be always positive, then when α is 0 the
Not-Yet control strategy simply reduces to the original clustering algorithm,
which becomes a particular case of a more general strategy. This fact
demonstrates the generality of the proposed strategy.
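
The control rule of Table 1 can be sketched in a few lines. This is a minimal
illustration under stated assumptions, not the authors' implementation: the
function name `not_yet_step`, the parameter name `alpha` for the Not-Yet
threshold, and the two callbacks `expected_utility` and `add` are hypothetical
stand-ins for whatever the host clustering algorithm provides.

```python
def not_yet_step(partition, instance, buffer, alpha, expected_utility, add):
    """Apply the control rule of Table 1 to a single instance.

    expected_utility(partition, instance) -> float and
    add(partition, instance) are supplied by the host clustering algorithm
    (hypothetical callbacks). Returns True if the instance was incorporated
    and False if it was deferred to the buffer.
    """
    e = expected_utility(partition, instance)
    if e > alpha:
        add(partition, instance)
        return True
    buffer.append(instance)  # "not yet": defer the decision
    return False


# Toy host algorithm (assumption): the partition is a flat list and the
# expected utility of an instance is simply its own positive value.
partition, buffer = [], []
for x in [0.9, 0.2, 0.7]:
    not_yet_step(partition, x, buffer, alpha=0.5,
                 expected_utility=lambda p, i: i,
                 add=lambda p, i: p.append(i))
print(partition, buffer)
```

Calling the same function with `alpha=0` and a strictly positive utility never
touches the buffer, which is the special case in which the rule reduces to the
original algorithm.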

Figure 2 shows a typical run of a clustering system on two extreme cases
of instance ordering. The graph shows the evolution of the value of a clustering
quality function with every new observed instance. When instance ordering is
good, the graph reveals that high quality clusters are initially constructed
because instances cover very different clusters underlying the domain, which are
easy for the system to discriminate. Later, the clustering gracefully
converges to the global quality maximum. This maximum is below the initially
obtained scores since additional instances may present a more uncertain cluster
membership, so that initial confidence is reduced. Bad orderings have the
opposite effect. Low quality clusterings are created at the beginning, which
strongly condition the rest of the process. We added a third curve reflecting the
evolution of the clustering quality when using the Not-Yet strategy with a bad
ordering. At the beginning, the system behaves in the same manner as the original
algorithm, but as it buffers some instances (around the fifth instance) the
cluster quality increases. When quality reaches its maximum, it converges with
the good ordering curve. The graph clearly shows how our strategy tries to reach
the same 'good learning path' starting from a different learning point. Note
that the horizontal axis measures the number of instances effectively
incorporated into the cluster structure. So, when using the Not-Yet strategy, a
quality maximum is reached around the 25th instance but, at this point, other
instances may have been observed and buffered as well. In fact, in the
particular case of the graph shown, only around 25 instances passed the Not-Yet
'filter' the first time they were observed, the rest of them being buffered.
When incorporating buffered instances a decrease of the clustering quality is
observed, but this behavior is similar to that of the system with a good
ordering. It is worth noting that the discussed graph shows a selected example
of an optimal behavior of the Not-Yet strategy for illustration purposes.

Fig. 2. An example of the Not-Yet strategy effect on clustering quality

Complexity when using our strategy will vary from system to system,
depending on the cost of effectively incorporating an instance and computing the
expected utility or confidence of adding the instance. However, most clustering
systems use some quality function to decide the best choice when an instance
is observed, so this function is likely to be a good candidate to measure the
amount of utility/confidence. Also, we can assume the buffer reordering to be
random and hence of linear cost. If this is the case, complexity is increased by
a constant factor. This factor depends on the number of times every instance is
considered for incorporation into the cluster structure. Obviously, given the
open formulation of the Not-Yet strategy, more complex criteria might be used,
at the price of a higher computational complexity.

Let I be an instance and P be a partition
Let M1, M2 be the best and second best CQF
Let α be the Not-Yet threshold value
if (1 - M2 / M1) > α then add(P, I)
else add_NY_buffer(I)
endif

Table 2. Implementation of the Not-Yet control strategy for the experiments.

4 Experiments 

In order to empirically evaluate the Not-Yet strategy we conducted several
experiments using four well-known datasets of the UCI repository [12]. Since the
clustering task is an unsupervised learning task, we have treated labels just as
another attribute. In the experiments we assume a general model of hierarchical
incremental clustering using two basic operators, one for creating a new class
and another to incorporate an instance into an existing class. A concept hierarchy
grows incrementally as new instances are observed after applying one of these
operators according to the value of some cluster quality function (CQF). This
is a typical model of incremental clustering using a hill climbing strategy which
estimates the goodness of applying the available operators and chooses the best
option, without reconsidering any decision made. In particular, this model
corresponds to the one used in the COBWEB system [4]. The measure of category
utility used in this system is also used in the experiments as the CQF. We used
a COBWEB-like clustering strategy because it is simple, well-known and it has
been applied (or augmented) in several learning systems [1,8].
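
As a reminder of what such a CQF computes, the sketch below implements category
utility in the form usually given for COBWEB with nominal attributes: for each
cluster, the gain in the expected squared probability of attribute values
relative to the whole dataset, weighted by the cluster's probability and
averaged over the number of clusters. The exact variant used in the experiments
is not stated here, so the function `category_utility` and its input layout (a
list of clusters, each a list of equal-length tuples) should be read as
assumptions.

```python
from collections import Counter


def category_utility(clusters):
    """Category utility of a flat partition of nominal instances.

    clusters: list of non-empty clusters; each cluster is a list of
    tuples holding one nominal value per attribute.
    """
    all_instances = [inst for c in clusters for inst in c]
    n = len(all_instances)
    n_attrs = len(all_instances[0])

    def sum_sq(instances):
        # Sum over attributes and values of P(attribute = value) ** 2.
        total = 0.0
        for a in range(n_attrs):
            counts = Counter(inst[a] for inst in instances)
            total += sum((cnt / len(instances)) ** 2 for cnt in counts.values())
        return total

    base = sum_sq(all_instances)
    gain = sum(len(c) / n * (sum_sq(c) - base) for c in clusters)
    return gain / len(clusters)


pure = [[("a", "x"), ("a", "x")], [("b", "y"), ("b", "y")]]
mixed = [[("a", "x"), ("b", "y")], [("a", "x"), ("b", "y")]]
print(category_utility(pure), category_utility(mixed))
```

A partition whose clusters separate the attribute values scores higher than one
whose clusters mirror the overall distribution, which scores zero.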

In addition, we considered an augmented version of this basic procedure 
by adding the merge and split operators used in COBWEB. Briefly, the merge 
operator modifies a hierarchy by combining two existing clusters while the split 
operator breaks existing clusters into smaller ones. Split and merge operators 
provide a sort of backtracking to the clustering system. Since these operators 
constitute a well-known strategy, they may be taken as a basis for evaluating
the Not-Yet results. Moreover, as the two strategies represent different approaches,
it is possible to investigate a combined approach, as we will discuss later.

As stated before, we embed the basic control procedure into another one
implementing the Not-Yet strategy. The strategy does not incorporate an instance
into a cluster structure if there is not enough evidence to decide between the
available operators. As shown in Table 2, for each instance, the ratio between
the second best CQF and the best one is computed, and the confidence is one
minus this ratio. We consider that an operator does not yield a significantly
better clustering than the others if the confidence is below the α threshold,
which is in the [0,1] range. Also, as we mentioned before, we can select some
method to reorder the instances in the buffer prior to reprocessing. Instead of
simply maintaining the order of insertion in the buffer, we chose to randomize
the instances.
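
The confidence test of Table 2 amounts to a one-line computation. The sketch
below is illustrative only: `confidence` and `should_incorporate` are
hypothetical names, `alpha` stands for the Not-Yet threshold, and the code
assumes that at least two operators were scored and that the best CQF score is
positive.

```python
def confidence(cqf_scores):
    """Return 1 - M2/M1 for the best (M1) and second best (M2) CQF scores.

    Assumes at least two candidate operators and a positive best score.
    """
    best, second = sorted(cqf_scores, reverse=True)[:2]
    return 1.0 - second / best


def should_incorporate(cqf_scores, alpha):
    """True if the best operator is clearly better than the runner-up."""
    return confidence(cqf_scores) > alpha


# Two near-tied operators: the instance would be buffered for alpha = 0.2.
print(should_incorporate([1.00, 0.95], alpha=0.2))
# A clear winner: the instance would be incorporated.
print(should_incorporate([1.00, 0.50], alpha=0.2))
```

When the two best options score almost the same, the ratio is close to 1, the
confidence is close to 0, and the instance is deferred.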

We tested our strategy with the basic and the augmented clustering models,
under two different experimental conditions. In the first experiment, the criterion
to reprocess instances was simply to flush the buffer at the end of the
clustering process. Therefore, an unlimited buffer size was assumed. In
practice, the size was limited by the total number of instances of each tested
dataset. As we will discuss later, this criterion may appear to be counterintuitive
in a strict incremental learning environment. So, a second set of experiments
was conducted limiting the buffer size, with instance reprocessing being
fired as soon as the buffer becomes full.
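
The two buffer policies can be sketched as one driver loop. Everything named
here is an assumption made for illustration: `run_not_yet`, its `confidence`
and `add` callbacks, and in particular the choice to add flushed instances
unconditionally after shuffling them, since the text does not say whether
reprocessed instances are tested against the threshold again.

```python
import random


def run_not_yet(instances, alpha, confidence, add, buffer_size=None, seed=0):
    """Process instances one at a time, deferring low-confidence ones.

    buffer_size=None models the unlimited buffer, flushed once at the end;
    an integer flushes the buffer whenever it holds that many instances.
    Assumption: flushed instances are shuffled and added unconditionally.
    Returns the partition and the fraction of instances that were buffered.
    """
    rng = random.Random(seed)
    partition, buffer, deferred = [], [], 0

    def flush():
        rng.shuffle(buffer)  # randomize the buffered instances
        for inst in buffer:
            add(partition, inst)
        buffer.clear()

    for inst in instances:
        if confidence(partition, inst) > alpha:
            add(partition, inst)
        else:
            buffer.append(inst)
            deferred += 1
            if buffer_size is not None and len(buffer) >= buffer_size:
                flush()
    flush()  # incorporate whatever is still buffered at the end
    return partition, deferred / len(instances)


# Toy callbacks (assumption): even numbers are "confident", odd ones are not.
toy_confidence = lambda partition, x: 1.0 if x % 2 == 0 else 0.0
toy_add = lambda partition, x: partition.append(x)
print(run_not_yet(list(range(10)), 0.5, toy_confidence, toy_add, buffer_size=2))
```

With a small `buffer_size` the deferred instances re-enter the partition in
several batches during the run; with `buffer_size=None` they all arrive after
the last instance has been seen.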

4.1 Experiments with Unlimited Buffer 

Experiments were performed on both random and worst case orderings. Table 3
shows the results obtained with both orderings using several values of the α
parameter for the Not-Yet strategy. The zero value for this parameter corresponds
to the original algorithm without buffering any instance. These experiments
assume that the main goal of clustering is to discover a high quality top level
partitioning of the data, and so we give the CQF scores of the first level.

Results demonstrate that instance ordering has a critical effect on cluster
quality. When bad orderings are presented, the quality of discovered clusterings 
significantly drops compared to the results obtained with random orderings. 
Results from the augmented version of the algorithm show the limitations of 
the plain version even with random orderings. Moreover, they give an idea of 
the CQF values that we could reasonably expect to obtain with each dataset. 
Results suggest that although the additional operators improve the behavior of 
the system, their impact is still limited. This is possibly due to the fact that 
they constitute a local strategy and are only triggered by new observations. 

The Not-Yet strategy does not appear to change the behavior of the clustering
procedure with random orderings, either with the basic or with the
augmented algorithm. With bad orderings, the strategy improves results with
both algorithms. However, in the case of the basic procedure the CQF values
are still lower than those obtained with random orderings. We could expect to
match the random ordering results by using the Not-Yet strategy, since the buffer
is randomized and a large number of instances is buffered in many cases. The
difference in the results seems to suggest that only a few instances may strongly
influence the rest of the clustering process. This problem is overcome by using
the additional operators and the Not-Yet strategy simultaneously, which results
in high quality clusterings. This result can be explained by the fact that the
Not-Yet strategy and the clustering operators are complementary approaches.
While the former consistently detects bad orderings and rearranges the instance
ordering to solve the problem, the latter allows the hierarchy to be modified at
the cluster level. So, although operators may theoretically deeply change the cluster
                   Basic                                    Augmented
                   CQF                       Buffered inst. CQF                       Buffered inst.
Dataset      α     Rand.        Worst        Rand.  Worst   Rand.        Worst        Rand.  Worst
soyb. small  0.0   1.49 (0.14)  0.91 (0.11)  0.00   0.00    1.62 (0.01)  1.12 (0.19)  0.00   0.00
soyb. small  0.05  1.52 (0.12)  0.99 (0.13)  0.03   0.46    1.61 (0.03)  1.42 (0.17)  0.21   0.37
soyb. small  0.10  1.49 (0.15)  1.02 (0.17)  0.11   0.80    1.60 (0.05)  1.52 (0.10)  0.52   0.80
soyb. small  0.15  1.48 (0.15)  1.04 (0.12)  0.40   0.88    1.60 (0.04)  1.58 (0.05)  0.69   0.89
soyb. small  0.20  1.46 (0.12)  1.10 (0.11)  0.81   0.90    1.61 (0.04)  1.58 (0.06)  0.83   0.90
soyb. small  0.25  1.48 (0.13)  1.12 (0.11)  0.93   0.91    1.61 (0.03)  1.59 (0.05)  0.93   0.93

soyb. large  0.0   1.00 (0.10)  0.63 (0.14)  0.00   0.00    1.17 (0.02)  0.95 (0.14)  0.00   0.00
soyb. large  0.05  1.03 (0.09)  0.65 (0.14)  0.02   0.06    1.16 (0.03)  1.06 (0.11)  0.22   0.40
soyb. large  0.10  1.05 (0.10)  0.72 (0.10)  0.09   0.81    1.16 (0.03)  1.13 (0.07)  0.52   0.74
soyb. large  0.15  1.07 (0.10)  0.73 (0.08)  0.31   0.96    1.16 (0.03)  1.16 (0.03)  0.82   0.96
soyb. large  0.20  1.05 (0.11)  0.76 (0.08)  0.75   0.98    1.17 (0.02)  1.16 (0.03)  0.90   0.98
soyb. large  0.25  1.01 (0.11)  0.77 (0.09)  0.92   0.99    1.16 (0.03)  1.16 (0.03)  0.96   0.99

house        0.0   1.29 (0.28)  0.85 (0.19)  0.00   0.00    1.61 (0.00)  1.43 (0.12)  0.00   0.00
house        0.05  1.33 (0.26)  0.82 (0.17)  0.00   0.01    1.61 (0.00)  1.50 (0.11)  0.01   0.30
house        0.10  1.35 (0.25)  0.84 (0.15)  0.01   0.09    1.60 (0.05)  1.53 (0.10)  0.06   0.48
house        0.15  1.34 (0.25)  0.83 (0.16)  0.08   0.72    1.60 (0.04)  1.58 (0.07)  0.17   0.80
house        0.20  1.37 (0.25)  0.83 (0.13)  0.29   0.98    1.61 (0.01)  1.60 (0.01)  0.38   0.93
house        0.25  1.35 (0.27)  0.85 (0.13)  0.64   0.98    1.61 (0.00)  1.60 (0.01)  0.66   0.99

zoo          0.0   1.05 (0.12)  0.67 (0.17)  0.00   0.00    1.17 (0.03)  0.95 (0.14)  0.00   0.00
zoo          0.05  1.06 (0.12)  0.67 (0.14)  0.02   0.10    1.17 (0.03)  1.05 (0.10)  0.13   0.39
zoo          0.10  1.05 (0.13)  0.73 (0.11)  0.06   0.54    1.17 (0.03)  1.10 (0.08)  0.30   0.67
zoo          0.15  1.03 (0.15)  0.77 (0.12)  0.19   0.87    1.16 (0.03)  1.15 (0.05)  0.53   0.87
zoo          0.20  1.00 (0.17)  0.76 (0.08)  0.45   0.92    1.17 (0.03)  1.16 (0.03)  0.70   0.93
zoo          0.25  1.03 (0.17)  0.78 (0.09)  0.78   0.93    1.16 (0.03)  1.16 (0.03)  0.83   0.94

mush         0.00  1.11 (0.22)  0.52 (0.13)  0.00   0.00    1.38 (0.00)  0.81 (0.24)  0.00   0.00
mush         0.05  1.16 (0.21)  0.53 (0.16)  0.00   0.01    1.38 (0.00)  1.18 (0.19)  0.03   0.51
mush         0.10  1.17 (0.20)  0.52 (0.16)  0.01   0.10    1.38 (0.00)  1.29 (0.11)  0.18   0.67
mush         0.15  1.17 (0.23)  0.65 (0.08)  0.07   0.99    1.38 (0.00)  1.38 (0.01)  0.46   0.98
mush         0.20  1.16 (0.22)  0.71 (0.06)  0.35   0.99    1.38 (0.00)  1.38 (0.00)  0.81   0.99
mush         0.25  1.15 (0.19)  0.73 (0.06)  0.77   0.99    1.38 (0.00)  1.38 (0.00)  0.90   0.99

Table 3. Clustering results. Averages and standard deviations over 50 trials



structure, they are not triggered by ordering conditions. On the other hand,
the Not-Yet strategy lacks the ability to directly modify the cluster structure
despite correctly detecting bad orderings.

Table 3 shows that the number of buffered instances increases as the α value
does, and also that this increase is faster with bad orderings. This demonstrates
the ability of the Not-Yet strategy to detect bad instance orders. In addition,
clustering quality tends to improve as we use higher α values. This result could
be expected given that, in this case, the number of instances being reordered is
larger. Despite this, an α value around 0.20 seems to perform reasonably well
in all tested datasets.
                   α
Buffer  Measure    0.10         0.20         0.30         0.35         0.40         0.45
100     CQF        1.11 (0.21)  1.12 (0.20)  1.23 (0.17)  1.22 (0.16)  1.23 (0.15)  1.23 (0.16)
        Buff.ins.  0.32         0.64         0.74         0.97         0.99         0.99
200     CQF        1.11 (0.23)  1.21 (0.22)  1.26 (0.15)  1.31 (0.08)  1.31 (0.08)  1.32 (0.07)
        Buff.ins.  0.35         0.69         0.82         0.99         0.99         0.99
300     CQF        1.20 (0.21)  1.22 (0.21)  1.28 (0.15)  1.32 (0.11)  1.32 (0.11)  1.32 (0.11)
        Buff.ins.  0.44         0.64         0.76         0.99         0.99         0.99
400     CQF        1.25 (0.14)  1.27 (0.14)  1.27 (0.13)  1.31 (0.07)  1.31 (0.08)  1.31 (0.08)
        Buff.ins.  0.51         0.85         0.91         0.99         0.99         0.99
500     CQF        1.26 (0.14)  1.24 (0.14)  1.26 (0.12)  1.24 (0.16)  1.24 (0.16)  1.23 (0.16)
        Buff.ins.  0.50         0.93         0.98         0.99         0.99         0.99
600     CQF        1.29 (0.12)  1.36 (0.09)  1.36 (0.09)  1.37 (0.02)  1.37 (0.02)  1.37 (0.02)
        Buff.ins.  0.56         0.65         0.81         0.99         0.99         0.99
700     CQF        1.29 (0.11)  1.38 (0.01)  1.37 (0.03)  1.38 (0.01)  1.38 (0.00)  1.37 (0.04)
        Buff.ins.  0.62         0.83         0.93         0.99         0.99         0.99
800     CQF        1.30 (0.11)  1.38 (0.00)  1.38 (0.00)  1.38 (0.00)  1.38 (0.00)  1.38 (0.00)
        Buff.ins.  0.64         0.89         0.99         0.99         0.99         0.99
900     CQF        1.30 (0.12)  1.38 (0.00)  1.38 (0.00)  1.38 (0.00)  1.38 (0.00)  1.38 (0.00)
        Buff.ins.  0.63         0.97         0.98         0.99         0.99         0.99

Table 4. Clustering results for the augmented algorithm with worst orders and
different buffer sizes. Averages and standard deviations over 50 trials.



4.2 Experiments with Limited Buffer 

In the previous experiments, the most important improvements are obtained at
the expense of maintaining a big buffer, i.e., with high α values. It may appear
to contradict the idea of incremental learning to maintain a buffer of
more than 90% of the instances in the dataset and flush the buffer at the end
of the process. Strictly speaking, in an incremental learning environment, the
system never achieves a final state, so it is not clear at which moment the buffer
has to be flushed. A natural solution is to limit the Not-Yet buffer so that
it is flushed several times during learning, without assuming any final
state in the learning process.

Table 4 shows results from the mushroom dataset with worst orderings and
different buffer sizes. We chose this dataset because it has a large number of
instances and is expected to give more relevant results. In these experiments
instance reprocessing is fired as soon as the buffer is full. Comparing
these results with the ones from Table 3, it can be observed that buffer sizes in
the 600-900 range give results similar to the ones obtained with unlimited buffer
size and the same α values. When reducing the buffer size, for α values around
0.20, which gave optimal results in the previous experiments, we also obtained a
reduction of the CQF scores. However, it is interesting to note that for these
smaller buffer sizes, higher CQF values can be achieved by increasing the α
threshold. Probably, this occurs because flushing the randomized buffer during
the learning process results in a partial cluster structure of reasonable
quality. So, when new instances are considered, the α value does not constrain
their incorporation into the clustering enough. These claims are supported by
the data in Table 4, where it is observed that smaller buffer sizes tend to
reduce the number of stored instances compared to larger buffer sizes for the
same α values.

5 Related Work 

Several works have approached the ordering problem in incremental clustering,
although this research has mainly benefited from two particular approaches.
Lebowitz first introduced the idea of deferred commitment within the framework
of his UNIMEM conceptual clustering system [10]. Our proposal extends
Lebowitz's work by decoupling the buffering strategy from any particular system.
Also, we have introduced the α parameter, which allows the original algorithm
to be seen as a particular case of the new control strategy. We think that this
formulation should help in applying the strategy to any existing algorithm
without any major changes.

The second related work (from which the Not-Yet name is borrowed) is the
application of this strategy to the LINNEO+ clustering system [2,13]. This work
contains the basic ideas proposed here, but again the application is tuned for
a specific system and the problems studied are deeply related to a particular
clustering strategy.

Although it is devoted to global methods, we have to mention Fisher's relevant
work on iterative optimization of clusterings [5]. This work explores several
methods for iteratively improving clustering quality, showing that some of these
methods exhibit an optimum performance. But recall that these methods often
operate by reprocessing the whole dataset and violate the constraints stated for
incremental clustering. This sort of strategy is useful from the viewpoint of a
data analysis task in which the entire dataset is available in advance, so that
we are not limited by strict incremental constraints.

6 Conclusions and Future Work 

We have presented a classification of strategies to avoid ordering effects in clus- 
tering with regard to two related dimensions, namely, the point of the clustering 
process in which they are applied and the scope of the strategy. This classifi- 
cation aims to clarify the benefits and disadvantages we can expect from the 
application of existing or newly proposed strategies. 

A new local strategy has been proposed to deal with ordering effects. We
think that the formulation of the strategy is simple and open in the sense that
it is not coupled with any particular evaluation function or algorithm. As a
local strategy, it has a limited impact on the entire conceptual structure, as the
experiments have shown. It is difficult to assess the quality of this improvement
beyond the simple quantitative analysis in terms of the CQF. For some
applications it may represent an important improvement in terms of
understandability or performance, while for others it may be imperceptible. On
the other hand, when coupled with another local strategy such as the merge/split
operators, the Not-Yet strategy allows the clustering system to reach an optimum
quality clustering even with worst orderings.

We have noted that the most significant benefits are obtained at the expense
of maintaining a large Not-Yet buffer. Since an incremental system has to be able
to use the acquired knowledge for some performance task at any learning stage,
we have to assume that the system also has to be able to quickly reincorporate
the buffered instances before acting. Our experiments demonstrate that it is
possible to use the Not-Yet strategy and maintain the strict incremental nature
of the clustering process by using limited buffer sizes. In practice, buffer
size will be limited by the number of instances that the system can manage in
a reasonable amount of time before entering ’performance mode’. This time
will depend on the particular application.

It is unclear how the proposed procedure scales up to large datasets such as
those typically referred to in data mining tasks [3]. However, we think that the
Not-Yet strategy may be an inexpensive and effective way of avoiding ordering
effects, since it is unlikely that a whole large dataset would present a bad order.
Rather, it will probably have badly ordered subsets, so that a large enough Not-Yet
buffer will be able to deal with the problem. Note that a size that could
be considered large for the buffer in the experiments may be simply a small
part of a very large dataset of thousands of instances. We plan to explore these
issues in future work.

Finally, it is worth remarking that the experiments conducted used a relatively
simple implementation of the Not-Yet strategy. Extensions studying the order
of instances in the buffer, the criterion to reprocess instances or the number of
times instances may be reprocessed appear to be promising topics for further
research.

Abstract. Decision problems can usually be solved using systems that
implement different paradigms. These systems may be integrated into 
a single distributed system, with the expectation of obtaining a group 
performance more satisfactory than individual performances. Such a dis- 
tributed system is what we call a Multi Agent Decision System 
(MADES), a special kind of Multi Agent System, that integrates several 
heterogeneous autonomous decision systems (agents). A MADES must 
produce a single solution proposal for the problem instance it faces, de- 
spite the fact that its decision making is distributed, and every agent 
produces solution proposals according to its local view and to its id- 
iosyncrasy. We present a distributed reinforcement algorithm for learning 
how to combine the decisions the agents make in a distributed way, into 
a single group decision (solution proposal). 

Topics: Multi Agent Systems, Machine Learning, Distributed Artificial 
Intelligence 



1 Introduction 

Two kinds of learning techniques are usually found in Multi Agent Systems 
(MAS): local learning and distributed learning [4]. Local learning is carried out 
by an agent on its own, without the need of the participation of other agents. 
This implies that the only available view of the world comes from the agent’s 
standpoint. Distributed learning is carried out as a result of the joint action
of several agents of the MAS. This means that distributed learning cannot be
accomplished as a result of the isolated action of one agent. In distributed learning
tasks, agents may need to observe other agents, or use knowledge facilitated 
by other agents [5]. In these situations, the agent’s view of the world may be 
qualitatively different from the view of the world that the agent perceives in 
local learning. Every agent observes the world from a different standpoint, and 
interprets this view according to its own insight. The concurrence of the agents’
individual actions gives rise to a group behaviour. Learning appropriate group
behaviours is the most common distributed learning task (e.g. [2], [4]). 

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 148-159, 1998.

Some complex decision problems can be solved by several different monolithic
decision systems, based on different machine learning or problem solving
paradigms. When the quality of the results obtained separately by these systems
is not satisfactory, they can be united in a Multi Agent DEcision System
(MADES) with the expectation of obtaining a joint performance superior to
the performance obtained using the monolithic systems in an isolated way. A 
MADES is a Multi Agent System built for decision making, where a single de- 
cision is output by the system as a group, although internally many decisions 
may be made locally by the component agents [1]. These decisions may be in- 
compatible, and even contradictory. But in spite of this, a single decision has 
to be made by the system to solve the problem at hand. How to incorporate 
these individual local decisions, into a single group decision, poses a group be- 
haviour learning problem for the MADES. In this work, we propose a distributed 
reinforcement learning algorithm to solve this problem. A simplified version of 
this algorithm was implemented, tested, and experimental results reported in [1]. 
Here we present the formalization of the full algorithm, and discuss the full range 
of its potential. We also formalize the concept of Multi Agent Decision System 
(MADES). 

Section 2 summarizes the IAO architecture. Section 3 presents an ordered
stepwise procedure for the solution of problems using the IAO distributed reinforcement
learning algorithm in MADES. Section 4 discusses the two ways
in which this algorithm can influence behaviors in MADES. Sections 5 and 6
reproduce some results obtained with a simplified and restricted version of the
algorithm presented here. We conclude with section 7 where we discuss important
topics.



2 The Intelligent Agents Organization 

The Intelligent Agents Organization (IAO) is a Multi Agent Decision System
architecture aimed at solving complex decision problems [1]. Figure 1 shows a
high level view of this architecture:

— One agent, known as the referee, is in charge of the overall system control.
It broadcasts problem instance descriptions (service requests) and control
signals to the rest of the team. It then receives the respective replies from the
rest of the agents. These replies may be either advice or problem solving
proposals. The relationship between the referee and the rest of the agents
can be regarded as a client-server relationship. The services the referee may
request from an agent are either the synthesis of a solution proposal (only from
worker agents) or advice (only from advisor agents). These service requests
are scheduled in a way that maximizes parallelism (every agent runs on a
different machine), so the MAS response time is minimized.

— A worker agent receives problem descriptions from either the referee or
another worker, and replies with solution proposals. Worker agents work
in parallel on a solution proposal to the same problem instance, and are
capable of autonomous decision making. Any of them could be the basis
of a monolithic system aimed to solve each problem. The only restriction
imposed on workers is that they should adhere to the client-server protocol.
Other than that, there is total freedom for the system designer to use any
agent he/she wishes as a worker. Due to this flexibility, the IAO model is
widely applicable.

— Several agents play the role of advisors. They are contacted by either the
referee or a worker agent, and receive a problem instance as input. They reply
to the requester with the identification of the worker agent that they expect
to be the most competent of the group in the solution of that
problem instance. The referee will use this advice when one of the proposals
that the worker agents provide has to be selected. A worker agent may use this
advice to ask the worker that the advisor indicated for its cooperation in
the solution of the problem instance.

— A trainer agent produces problem instances that are used for training and
testing. Problem generation can be done either randomly or using an “ad
hoc” scheme. As is widely known, the criteria for problem synthesis affect the
success of the learning effort.

Fig. 1. A schematic view of the IAO Architecture.

A more complete description of the architecture can be found in [1].



3 The IAO Distributed Reinforcement Learning Algorithm

The purpose of this algorithm is to enable the agents to improve their collective
behaviour themselves. It is applicable to any MADES that follows the role
differentiation specified by the IAO model. Next, we describe the steps to be followed
for problem solving with the IAO model.

3.1 Problem Modeling

The domain is modeled as a problem space:

\[ World = (S_w, A) \]

where

— $S_w$ is the state space, whose representation depends on the problem at hand.

— $A$ is the action set, whose elements are all the possible actions that can be
executed on states.

A Multi Agent Decision System is modeled as

\[ MADES = (S_m, i, A_m, A_r, A_w, A_a, A_t, g) \]

where

— $S_m$ is the set of states of the world as the MADES perceives them.

— $i$ is the input function, which translates from the external to the internal state
representation,

\[ i : S_w \to S_m \]

In decision problems, $i$ is usually the unit (identity) function, so $S_m$ is usually equal
to $S_w$.

— $A_m$ is the set of actions that the MADES can execute on the world, which
is usually equal to $A$.

— $A_r$ is a single element set formed by an autonomous agent that controls the
ordered execution of the system’s decision making algorithm, and is known
as the referee.

— $A_w$ is the set of workers, autonomous agents that receive problem descriptions
and reply with a proposal of the decision to be made to solve the
problem. We use $w$ to denote the number of workers in the MADES.

— $A_a$ is the set of advisors, autonomous agents whose mission is to observe the
worker agents, to learn their competencies, and to inform other agents about
their findings.

— $A_t$ is the set of trainer agents, whose mission is to synthesize problem
instances for the MADES to train on.

— $g$ is the group behaviour, a control policy that specifies how to produce a
single group decision from the agents’ decision proposals, with one proposal
from each of the $n_a$ agents,

\[ g : S_m \times A_g^{n_a} \to A_m \]
An autonomous agent is modeled as

\[ agent = (S_g, A_g, i_g, C_g, S'_g, S_i, b) \]

where

— $S_g$ is the set of the possible inputs the agent may receive.

— $A_g$ is the set of the actions the agent can execute. In MADES, the typical
actions are decision making and adaption.

— $i_g$ is the agent’s input function,

\[ i_g : S_m \to S_g \]

— $C_g$ is the set of the communication acts that the agent can perform. In
MADES, they are usually either the request of a decision making service from
another agent (i.e. a consultation), or the reply to a decision request with
the results.

— $S'_g$ is the set of all the data the agent receives from other agents (e.g. a
classification request).

— $S_i$ is the set of the agent’s internal states.

— $b$ is the agent’s behaviour function,

\[ b : S_g \times S'_g \times S_i \to A_g \]



3.2 Partitioning of the State Space 

If an advisor based on reinforcement learning is used to estimate which
worker (say $worker_i$) is the most competent of the group for the solution of a
given problem instance ($problem_i$), then a huge reinforcement table of the form
($problem_i$, $worker_i$) has to be stored. The storage of this table is unaffordable
in sufficiently complex problems. If all the problem instances with some qualitative
similarity were handled best by the same worker, then we might build
reinforcement tables associating the appropriate worker to the set containing
this collection of similar problem instances. The reinforcement table then has
far fewer entries, and takes the form ($set_i$, $worker_i$). In order to
obtain a set description of the state space we partition it as explained next.

The state space $S$ is partitioned into $s$ subsets, $S = \bigcup_{i=1}^{s} S_i$, such that
$S_i \cap S_j = \emptyset$ when $i \neq j$. Similar problem instances should be contained within
the same subset, because the algorithm works assuming the following hypothesis:
if one worker is more competent for the solution of a given problem than
the rest of the workers, then it will also be more competent than its peers in the
solution of problem instances similar to that one (i.e. in the solution
of problem instances that lie in the same subset of the partition). Unsupervised
classification methods may be used for this task. The problem lies in finding an
adequate distance function in $S$: a good function requires that the person who
builds it have deep knowledge about the problem domain, and about which are
the most predictive features in the problem representation.

The indexed characteristic function locates the subset of the partition that
contains a given problem instance:

\[ c : S \to \mathbb{N}, \quad \forall x \in S, \; c(x) = i \Leftrightarrow x \in S_i \]

A cumulative reinforcement table $T_i = [r_{i,1} \ldots r_{i,w}]$, where $w$ is the number of
workers, is assigned to every $S_i$. The $j$th entry of $T_i$ is the cumulative
reinforcement received by the $j$th worker as a result of its past decisions on
problems in $S_i$. The reinforcement tables $T_i$ form the rows of the cumulative
reinforcement matrix

\[ \begin{bmatrix} r_{1,1} & \cdots & r_{1,w} \\ \vdots & \ddots & \vdots \\ r_{s,1} & \cdots & r_{s,w} \end{bmatrix} \]

where $r_{i,j}$ is the cumulative reinforcement received by agent $j$, as a result of its
intervention in the solution of problems that belong to $S_i$.



3.3 Problem Solving Trace 

Some problems require that the decision system make a series of decisions before 
the problem can be considered as solved (as in robotics problems or games). In 
these problems, the “world” evolves following a series of states, 

\[ Trace = [s_1 \ldots s_n] \]

as a result of a series of actions/decisions taken by the MADES,

\[ SystemTrace = [D_1 \ldots D_n] \]

The MADES makes decision $D_i$ based on the local decisions made by the agents
at time $i$: $d_{i,1} \ldots d_{i,n_a}$, where $n_a$ is the number of autonomous agents in the
MADES.



3.4 Distributed Credit Assignment 

When the outcome of the problem solving episode is finally known, the deci- 
sion system is reinforced according to the desirability of this outcome. Since 
the decisions were made by a MADES, in a distributed fashion, the credit as- 
signed will also have to be distributed among the agents that participated in 
the decision. Let us consider that the world is in the state $s_i$ and the agents
propose decisions/actions $d_{i,1} \ldots d_{i,n_a}$; the resulting group’s decision is $D_i$.
The distributed credit assignment algorithm distributes credit in a twofold way:
on the one hand, credit is distributed in time (because early, distant states
influence the outcome less than recent states); on the other hand, credit is also
distributed “spatially” among the agents that take part in the distributed decision
making procedure, according to their opposition/support to the decision
that the MADES finally produced. Thus, the distributed credit assignment function
depends on the iteration $i$, the desirability of the final outcome $o$, and the
participation of the agents that compose the MADES (which is represented by
the local assessments/decisions they forward, $d_{i,1} \ldots d_{i,n_a}$),

\[ dca : \mathbb{N} \times \mathbb{R} \times A_g \times \cdots \times A_g \to \mathbb{R}^{n_a} \]

The output of the distributed credit assignment function $dca$ is used as the
reinforcement vector at time step $i$,

\[ RV(i) = dca(i, o, d_{i,1} \ldots d_{i,n_a}) \]

so we get a series of reinforcement vectors, $RV(1) \ldots RV(n)$.

3.5 Reinforcement Tables Update

The cumulative reinforcement tables $T_j$, $j = 1 \ldots s$, are updated with the proper
reinforcement vectors:

\[ T_{c(s_i)} = T_{c(s_i)} + RV(i) \]
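
The update rule can be illustrated with a minimal sketch. It assumes that each table is a plain list with one entry per worker and that the tables are kept in a dictionary keyed by the subset index c(s); the names `make_tables` and `update_tables`, and the toy parity partition, are hypothetical and not part of the original system.

```python
from collections import defaultdict


def make_tables(num_workers):
    """One zero-initialised cumulative reinforcement table per subset."""
    return defaultdict(lambda: [0.0] * num_workers)


def update_tables(tables, trace, reinforcement_vectors, characteristic):
    """Add RV(i) to the table of the subset that contains state s_i."""
    for state, rv in zip(trace, reinforcement_vectors):
        row = tables[characteristic(state)]  # table T_{c(s_i)}
        for j, r in enumerate(rv):
            row[j] += r
    return tables


# Toy example (assumption): states are integers, partitioned by parity.
tables = make_tables(num_workers=3)
trace = [4, 7, 10]
rvs = [[0.5, 0.0, -0.5], [1.0, 0.0, 0.0], [0.25, 0.25, 0.0]]
update_tables(tables, trace, rvs, characteristic=lambda s: s % 2)
```

States 4 and 10 fall in the same subset, so their reinforcement vectors accumulate in the same table, while state 7 updates a different one.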



4 The Use of Distributed Reinforcement Learning in a 
MADES 

The distributed reinforcement learning algorithm we present can improve the 
group’s behaviour in two ways: 

1. It is a way to learn the workers’ competencies. By comparing
the cumulative rewards of the workers for a given problem, the advisors can
determine which worker is expected to handle that problem best. This results
in an improvement of the group behaviour.

2. Workers can evolve in a way that maximizes their future rewards, which
produces a “local” adaptation.

5 An Instance of a MADES 

We have chosen checkers endgames as the problem domain for our experiments. 
These endgames contain 8 pieces at most, and the majority pose a considerable 
difficulty for human players [3]. We have solved the problem following the steps 
outlined in section 3. The details are shown next. 

The world is modeled as 

World = (S, A) 

where 

— $S$ is composed of all the legal checkers situations with eight pieces at most.
We use the following notation: $s_j = (c_1 \ldots c_{32})$ such that $c_i \in \{w, p, b, m, e\}$,
where w denotes a white man, p a white king, b a black man, m a black king
and e an empty square.

— $A$ is composed of a single type of action, the move of a piece from one square of
the board to another, observing the rules of checkers. We denote it as
$mov(x_1, y_1, x_2, y_2, Capture, CaptureList)$, which represents an action that
moves the piece at location $(x_1, y_1)$ to $(x_2, y_2)$, removing from the board
all pieces whose locations are specified in CaptureList when Capture is
bound to yes.

The Multi Agent Decision System is modeled as 

\[ MADES = (S, 1, \{observe, move\}, \{Ref\},
  \{alphaAG, bayesAG, backpropAG, c4.5AG, hybridAG\},
  \{reinf, rote\}, \{ctrainer\}, g) \]



where

Since the input function is the unit function, no preprocessing is done upon
the observed state.

$Ref = (S, \{move\}, 1, \{ask, listen\}, S'_{ref}, \{\sigma_i, i = 1 \ldots 6\}, b_{ref})$, which
means that the input the referee receives is the aforementioned set of the
legal checkers situations ($S$). The action it can execute is a legal move on the
checkers board. Again, the input function is the unit function. The only
communication acts it can perform are asking another agent for something
and listening to its reply. The internal states $\sigma_i$ are described in the following
control algorithm, which specifies the behaviour $b_{ref}$ of the referee. The referee
is at internal state $\sigma_i$ when step $i$ of the algorithm is being executed.

1. get current state description (the current board’s situation) 

2. broadcast it to the rest of the agents 

3. receive their replies (the move they would execute on that situation) 

4. choose one of the proposals the agents forwarded 

5. execute that action (i.e. make a move on the board) 

6. if a final state has not yet been reached, then repeat control cycle 
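
The six steps of the control cycle can be sketched as a simple loop. This is only an illustrative sketch: the function `referee_cycle`, its callable parameters and the `max_cycles` guard are hypothetical, and the agents are called sequentially here, whereas in the architecture described above they run in parallel on different machines.

```python
def referee_cycle(state, agents, choose, apply_move, is_final, max_cycles=1000):
    """Run the referee's control cycle; all callables are hypothetical stand-ins."""
    trace = []
    for _ in range(max_cycles):
        # Steps 1-3: take the current state, broadcast it, collect the replies.
        proposals = {name: propose(state) for name, propose in agents.items()}
        # Step 4: choose one of the forwarded proposals.
        move = choose(state, proposals)
        trace.append((state, move))
        # Step 5: execute the chosen action.
        state = apply_move(state, move)
        # Step 6: stop once a final state has been reached.
        if is_final(state):
            break
    return state, trace


# Toy stand-in for a game (assumption): the state is a counter that moves reduce.
agents = {"small_step": lambda s: 1, "big_step": lambda s: 2}
final_state, trace = referee_cycle(
    5,
    agents,
    choose=lambda s, proposals: max(proposals.values()),
    apply_move=lambda s, m: s - m,
    is_final=lambda s: s <= 0,
)
```

The returned trace pairs each visited state with the move that was executed on it, which is the kind of record needed later for credit assignment.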

$A_w = \{alphaAG, bayesAG, backpropAG, c4.5AG, hybridAG\}$ is the set of
workers. alphaAG is a simple alpha-beta searcher. hybridAG is an alpha-beta
searcher that consults the reinf advisor at leaf nodes, and receives
from it the identification of the worker that is expected to solve that
situation best. Then, hybridAG contacts that worker and requests its collaboration
to evaluate the node. If that collaboration is not possible, the node is
evaluated locally. The rest of the workers are classifiers that make decisions
based on what they have learnt during their training. The specification of
all the workers is quite similar. The input set is the same for all the workers:
the set of legal checkers situations. The input function is different for every
worker, because they do not share internal representations. The behavior
function provides a proposal according to the decision making formalism of the
agent (feed-forward neural network, decision tree or whatever).

Aa = {reinf, rote} The reinf advisor learns the competencies of the work- 
ers, and represents them in reinforcement tables. The reinforcement a worker 
obtains for the solution of a certain class of problems is used by other agents 
as an indication of how good the worker is at the task. The rote advisor 
keeps track of who is the worker that best solves a given problem by means 
of rote learning. Since it cannot generalize, it is not useful in problems with 
huge state spaces like this one, so we discontinued its use after some testing. 

The agents’ acts are {assess, reply, ask, listen}. An agent can produce a decision as the result
of an assessment (compute the next move), can communicate the decision back
to the requester, or may ask another agent for some service (decide
which of the workers is the most competent to solve this checkers situation,
recommend a move to perform next, or evaluate a checkers situation).

At = {trainer} is a special agent built to produce checkers problems of 
tunable difficulty. 

The group behaviour is the result of a voting mechanism. Every worker votes 
for a move (the workers supported by the advisors get extra votes) , and the 
winning move is the one the MADES outputs. 
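
The voting mechanism can be sketched as follows. The text does not state how many extra votes an advisor-supported worker receives or how ties are resolved, so the `extra_votes` parameter and the first-proposed tie-break are assumptions made only for illustration; `group_decision` is a hypothetical name.

```python
from collections import Counter


def group_decision(proposals, supported=(), extra_votes=1):
    """Return the move with most votes; supported workers get extra votes.

    proposals maps worker name -> proposed move. The weight of the extra
    votes and the tie-break rule are assumptions, not taken from the paper.
    """
    if not proposals:
        raise ValueError("at least one proposal is required")
    votes = Counter()
    for worker, move in proposals.items():
        votes[move] += 1 + (extra_votes if worker in supported else 0)
    best = max(votes.values())
    # Tie-break (assumption): the earliest proposed move among the winners.
    for move in proposals.values():
        if votes[move] == best:
            return move


proposals = {
    "alphaAG": "a",
    "bayesAG": "b",
    "backpropAG": "b",
    "c4.5AG": "a",
    "hybridAG": "c",
}
plain = group_decision(proposals)
advised = group_decision(proposals, supported={"backpropAG"})
```

With no advice the two leading moves tie and the tie-break decides; once the advisor supports backpropAG, its move wins outright.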




J. Ignacio Giraldez and Daniel Borrajo
5.1 Partitioning of the State Space 

The space of legal checkers situations is partitioned according to the following 
criteria: 

— number of white men 

— number of black men 

— number of white kings 

— number of black kings 

— existence of capture opportunities for white 

— existence of capture opportunities for black 

— existence of crowning opportunities for white 

— existence of crowning opportunities for black 

The indexed characteristic function $c(x)$ computes these 8 indexes to locate the
subset of the partition where $x$ lies.

As an example, we show the cumulative reinforcement table for the subset
whose indexes are (1, 0, 2, 1, yes, no, no, yes):

\[ T_{(1,0,2,1,yes,no,no,yes)} = [backprop(0.3333), bayes(-0.2053), c4.5(-0.0625), alpha(-0.5357), hybrid(0)] \]

If the reinf advisor were consulted about who is expected to be the most competent
worker to make decisions when the current state belongs to the aforementioned
subset, the advisor would reply with the backpropAG identifier, since this is the
worker with the highest cumulative reward.
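
The advisor's reply amounts to an argmax over the table of the relevant subset. The sketch below uses the values of the example table; storing the tables in a dictionary keyed by the 8-index tuple, and the name `most_competent`, are assumptions made for illustration.

```python
def most_competent(table):
    """Return the worker with the highest cumulative reinforcement."""
    return max(table, key=table.get)


# Tables keyed by the 8 partition indexes (assumption about the storage layout).
tables = {
    (1, 0, 2, 1, True, False, False, True): {
        "backprop": 0.3333,
        "bayes": -0.2053,
        "c4_5": -0.0625,
        "alpha": -0.5357,
        "hybrid": 0.0,
    }
}

advice = most_competent(tables[(1, 0, 2, 1, True, False, False, True)])
```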
5.2 Problem Solving Trace

A trace is kept of the series of successive checkers situations, along with the
identification of the workers that supported the decision (action) that leads to
the next state. As an example, a part of a sample trace is shown:

credit(proposal(e, m, e, e, e, e, e, e, b, e, p, e, e, e, w, e, e,
                e, e, e, e, e, w, e, e, e, p, b, e, e, e, e), [alpha]).
credit(proposal(e, m, e, e, e, e, e, e, b, w, p, e, e, e, e, e, e,
                e, e, e, e, e, w, e, e, e, p, e, e, e, e, m), [c4_5, alpha]).
credit(proposal(e, m, e, e, e, e, e, e, e, w, e, e, e, b, p, e, e,
                e, e, e, e, e, w, e, e, ...), [c4_5, bayes]).
credit(proposal(e, m, e, e, e, e, e, e, e, w, e, e, e, e, p, e, b,
                e, e, e, e, e, w, p, e, e, e, e, e, e, e, m), [alpha]).
credit(proposal(e, m, e, e, e, e, e, e, e, w, e, e, e, e, p, e, e,
                e, w, e, b, e, e, p, e, e, e, e, e, e, e, m), [backprop]).


5.3 Distributed Credit Assignment 

The aforementioned problem solving trace is used to determine which workers
agree with the main variation at every node of the game tree. The tree is traversed
from leaves to root, assigning credit to the workers at every node. At a
node at depth CurrentDepth, the workers receive the following reinforcement:

\[ Reinf(worker) = Result \cdot Agree \cdot \frac{CurrentDepth}{NMoves} \]

where NMoves is the total number of moves executed by the MADES in the
endgame. Result is equal to +1 if the MADES won the game, and −1 otherwise.
Agree is equal to +1 if the worker agreed with the main variation, and 0
otherwise (no credit assigned). The discounting mechanism is implemented in
the CurrentDepth variable: agents are assigned less credit in the final outcome
when their decisions are made at shallow nodes. This formula computes the entries
of the reinforcement vector. The indexed characteristic function is used to
locate the subset the current node belongs to. Then the reinforcement table
associated with this subset is fetched, and updated with the reinforcement vector.
This algorithm is repeated for every node along the main variation.
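
A minimal sketch of this credit assignment follows. It assumes that the depth-dependent discount is the ratio CurrentDepth / NMoves and that depths are counted from 1 at the first move; both are assumptions consistent with the description above rather than details confirmed by it, and the function names are hypothetical.

```python
def reinforcement(result, agreed, current_depth, n_moves):
    """Credit for one worker at one node (assumed discount: depth / n_moves)."""
    agree = 1 if agreed else 0
    return result * agree * current_depth / n_moves


def reinforcement_vectors(supporters_per_node, workers, won):
    """One reinforcement vector per node of the main variation.

    supporters_per_node[i] lists the workers that agreed with move i + 1.
    """
    result = 1 if won else -1
    n_moves = len(supporters_per_node)
    return [
        [reinforcement(result, w in supporters, depth, n_moves) for w in workers]
        for depth, supporters in enumerate(supporters_per_node, start=1)
    ]


workers = ["alpha", "bayes"]
supporters = [["alpha"], ["alpha", "bayes"], [], ["bayes"]]
vectors = reinforcement_vectors(supporters, workers, won=True)
```

Under these assumptions a worker that supported only the first of four moves in a won game receives 0.25, while one that supported the last move receives the full credit of 1.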



6 Results 

The MADES played test games against every one of its workers, with the fol- 
lowing results: 



opponent      MADES advantage
C4.5AG        21%
backpropAG    17.5%
bayesAG       17%
hybridAG      2%
alphaAG       2%

To explain how the MADES advantage percentage is computed, let’s use the test
against alphaAG as a reference. A 2% advantage of the MADES over alphaAG
means that the MADES wins 4% more test endgames than it loses. So, its real
advantage is 4% / 2 = 2%, because if the opponent (alphaAG) won 2% more
endgames its score would increase by 2% and the MADES’ score would
decrease by 2%, so both would be even. This can be mathematically expressed
by this formula:



\[ s = \frac{[w(G) - l(G)] \times 100}{2 \times G} \]

where $s$ is the score percentage, $G$ is the number of played games, $w()$ is the
number of won games, and $l()$ is the number of lost games.
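
As a quick check of the formula, the following sketch computes the score from win and loss counts. The function name `advantage` and the sample counts are illustrative only and are not taken from the experiments.

```python
def advantage(won, lost, games):
    """Score percentage s = (won - lost) * 100 / (2 * games)."""
    if games <= 0:
        raise ValueError("games must be positive")
    return (won - lost) * 100 / (2 * games)


# Winning 4% more games than are lost gives a 2% advantage, as in the text.
score = advantage(won=52, lost=48, games=100)
```

Drawn games count towards the number of played games but not towards wins or losses, so `advantage(40, 36, 100)` gives the same score.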

These results show that the MADES beats any of its members, and this justifies
integrating the workers into the MADES. They have been obtained taking
advantage of only one of the adaptation opportunities that the distributed
reinforcement learning algorithm provides: reinforcement tables have been used for
learning competencies, but have not been used for the workers’ “local” adaptation
(which is now underway).

Some previous results are also worth mentioning. The MADES’ score has
evolved from -20% (when it had only played 300 training games) to +2.03%
(after 10832 training games). This improvement has been mainly due to the
learning of the workers’ competencies, but to a lesser extent it has also been due to
worker replacements. The flexibility of the IAO model allows the replacement of a
worker by another, and the adaption of the rest of the system to the new MADES
composition, thanks to the adaptive behaviour of the advisors. We replaced
two workers during the MADES’ lifetime: the first replacement improved the
MADES’ score by 1.7 points, and the second by 1.17 points.



7 Discussion 

Distributed reinforcement learning can play a double role in Multi Agent Deci- 
sion Systems. On the one hand, control information can be learnt in the form 
of a competencies map, that is a map of the state space where it is specified 
which worker is expected to handle best every kind of problem instance. Since 
it may be known which worker is the most trustworthy for the solution of a 
problem instance, the distributed decision making procedure takes this worker’s 
proposal with a special consideration. So the group’s decision making procedure 
is adapted following the predictions of the competencies map. 

On the other hand, the cumulative reinforcement the worker agents receive
provides an indication that can be used for local adaption, which will eventually
be noticeable in the group’s behaviour, and in how the advisors characterize
them. The use of a different reinforcement table for every subset of the partition
provides a more detailed indication of how the adaption should be directed
than in other reinforcement learning proposals.

Abstract. Discretization of continuous attributes is an important task
for certain types of machine learning algorithms. Bayesian approaches, 
for instance, require assumptions about data distributions. Decision 
Trees, on the other hand, require sorting operations to deal with con- 
tinuous attributes, which largely increase learning times. This paper 
presents a new method of discretization, whose main characteristic is 
that it takes into account interdependencies between attributes. Detect- 
ing interdependencies can be seen as discovering redundant attributes. 
This means that our method performs attribute selection as a side effect 
of the discretization. Empirical evaluation on five benchmark datasets 
from UCI repository, using C4.5 and a naive Bayes, shows a consistent 
reduction of the features without loss of generalization accuracy. 

Keywords: Discretization, Feature Selection, Continuous Attributes



1 Introduction 

Discretization is a process that divides continuous numeric values into a set of 
intervals that can be regarded as discrete categorical values. There are two main 
reasons why discretization is an important task in machine learning. The first 
one is related to the Bayesian formalism. Methods based on this formalism require,
for each test example, the computation of the conditional probability of
the class given the example: $p(Cl_i \mid Example)$. For nominal attributes this probability
can be estimated with frequencies obtained from the training data. For
continuous attributes a strong assumption about the data distribution is needed.
Usually, in the absence of other information, we assume a normal distribution.
As such, the conditional probability is given by the probability density function
$p(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, where $\mu$ and $\sigma$ are the mean and standard
deviation of the attribute. Several authors (see for instance [3,7]) note that this
assumption is a very severe limitation of learning algorithms based on the Bayes
formalism.
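
To make the normality assumption concrete, the sketch below evaluates the normal density for a continuous attribute value, with the mean and standard deviation estimated from the training values of one class. The function names and the sample values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import math
import statistics


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    if std <= 0:
        raise ValueError("std must be positive")
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def class_conditional_density(x, training_values):
    """Estimate p(x | class) from the attribute values seen for that class."""
    mean = statistics.mean(training_values)
    std = statistics.stdev(training_values)  # sample standard deviation
    return normal_pdf(x, mean, std)


# Hypothetical attribute values observed for one class.
density = class_conditional_density(5.0, [4.0, 5.0, 6.0])
```

If the real distribution of the attribute is far from normal (for instance bimodal), this estimate can be badly off, which is the limitation the cited authors point out.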

The second motivation for performing attribute discretization is related to
computational complexity. As mentioned by Catlett [1] and others, the
performance of tree based learners is strongly conditioned by the sorting of
continuous attributes’ values. This is an operation that on average
takes $O(n \log n)$. Using profiling tools, Catlett observed on several large domains
that most of the CPU time was spent on sorting. This means that processing
continuous attributes is a bottleneck to efficient induction on very large
training sets.

Literature on attribute discretization abounds. A major reference in this 
subject is the work of Fayyad and Irani [5]. These authors proved that the 
limits of the obtained intervals are always in the boundary between examples of 
different classes. However, their approach has a limitation: it doesn’t take into 
account the interdependencies between attributes. This is the main advantage of 
our work: the discretization of each attribute takes into account the discretization 
of the other continuous features. 

Dougherty and his colleagues [4] define three different axes upon which we
may classify discretization methods: supervised vs. unsupervised, global vs. local
and static vs. dynamic. Supervised methods use the information of class labels
while unsupervised methods do not. Local methods, like the one used by C4.5,
produce partitions that are applied to localized regions of the instance space.
Global methods such as binning are applied before the learning process. In static
methods attributes are discretized independently of each other, while dynamic
methods take into account the interdependencies between them.

The following section of this paper reviews the related work, identifying the
main problems of the discretization of continuous features. Section three presents
our method in detail, stressing the search for the interdependencies between
attributes. In section four we perform an empirical evaluation of the method using
two well known algorithms, C4.5 and a naive Bayes, on five benchmark datasets
from the UCI repository. Finally, we present some conclusions and future work.

2 Related Work 

2.1 Main Problems 

Three main problems are addressed in the existing literature on discretization. 
The following sections present a brief discussion of these issues. 

How Many Intervals to Consider? We can find in the literature several
approaches: those that fix the number of intervals, k, in advance and those in which k
is automatically set. As examples of the former, C4.5 sets the number of intervals
to 2, and in [3] the number of intervals is fixed to 10. As examples of the latter, k
can be computed taking into account the number of distinct values observed
on the training set, for instance $k = \max(1, 2 \log l)$, where $l$ is the number
of distinct observed values ([4]), or using Cross Validation [12].

This last strategy has one advantage: given a dataset with several continuous 
features, the number of intervals of each feature depends on the number of 
different values observed on the training set. As such, different features can be 
discretized with different number of intervals. 
How to Allocate an Observed Value to an Interval? The simplest
discretization procedure, often used, is known as Equal interval width. It divides
the range of observed values for a feature into k equally sized bins. The only
advantage of this unsupervised method is its simplicity. The main drawback is
the influence of outliers.

Another simple discretization procedure, Equal frequency intervals, divides
a continuous feature into k bins where (given m instances) each bin contains
m/k values. A variation of this method, known as Maximal marginal entropy,
starts with equal frequency intervals and adjusts the boundaries to decrease the
entropy of each interval.
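The two unsupervised procedures above can be sketched as follows. This is a minimal illustration, not code from any of the cited systems; the function names and the convention that bins are closed on the right are assumptions made for the example.

```python
from typing import List


def equal_width_edges(values: List[float], k: int) -> List[float]:
    """Return the k-1 interior cut points of k equally wide bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_edges(values: List[float], k: int) -> List[float]:
    """Return k-1 cut points so that each bin holds about m/k values."""
    ordered = sorted(values)
    m = len(ordered)
    edges = []
    for i in range(1, k):
        pos = i * m // k  # index of the first value of bin i
        edges.append((ordered[pos - 1] + ordered[pos]) / 2.0)
    return edges


def assign_bin(value: float, edges: List[float]) -> int:
    """Index of the bin containing value (assumed closed on the right)."""
    return sum(1 for e in edges if value > e)
```

With the values 1, 2, 3, 4 and 100 and k = 3, the equal-width cut points are pushed far to the right by the single outlier, so the four small values share one bin and the middle bin stays empty. This is the drawback mentioned above.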

A more sophisticated approach is known in the literature as k-means. The
distribution of the values over the k intervals minimizes the intra-interval
distance and maximizes the inter-interval distance. It begins with an equal width
distribution of the observed values over the k intervals, followed by an iterative
process where the values near the boundaries change between intervals while this
process improves the criterion mentioned above.



How to Choose the Representative Value for Each Interval? The usual 
method chooses the mean of the values that fall on this interval. Due to the 
influence of outliers some authors prefer to use the median. 



2.2 Other Systems 

We now briefly describe some of the existing discretization systems. 



Unsupervised Methods The ChiMerge system [9] begins by placing each
observed real value into its own interval and proceeds by using the $\chi^2$ test to
determine when adjacent intervals should be merged.

The StatDisc [10] method also uses statistical tests as a means of determining
discretization intervals. This is also a bottom-up method that creates a hierarchy
of discretization intervals using the $\phi$ measure as a criterion for merging intervals.
StatDisc can merge N adjacent intervals at a time. Merging of intervals continues
until some $\phi$ threshold is achieved. The final hierarchy of discretizations can be
explored and a suitable final discretization automatically selected.



Supervised Methods The 1R system, presented by Holte [6], attempts to
divide the domain of every possible continuous variable into pure bins, each
containing a strong majority of one particular class. This method works reasonably
well when used in conjunction with the 1R induction algorithm.

Catlett [2] has explored the use of entropy-based discretization in decision
tree algorithms. Empirical studies have shown an impressive increase in
induction speed on very large datasets. This work was the precursor of Fayyad and
Irani's [5] recursive entropy minimization heuristic for discretization. To control
the number of intervals produced over the continuous space they use a Minimum
Description Length criterion. In the original paper, this method was applied
locally at each node during tree generation. Some other authors have used the
method as a global discretization with good results [4].

Robnik-Sikonja and Kononenko [11] have applied the ReliefF algorithm to 
discretization tasks. ReliefF is a method for estimating the probability that two 
near examples of the same class have the same value for an attribute and that 
two near examples from different classes have the same value for the attribute. 
The ReliefF is a heuristic measure to decide which of two discretizations is the 
“best”. 

3 Our Method 

We can look at all possible discretizations of a continuous feature as a hierarchy. 
The most general discretization is at the top of this hierarchy, and consists of 
one interval containing all the observed values. At the bottom of the hierarchy 
we have the most specific discretization which is a set of intervals, each con- 
taining a single value. For more than one feature we have a set of hierarchies. 
Static discretization methods consider that the discretization of one feature is 
independent from the discretization of the others. Our proposal is to perform a 
search over all the hierarchies of possible discretizations for all the attributes. By 
proceeding this way the discretization of one feature depends on the discretiza- 
tion of other features. Thus we are able to explore inter-dependencies between 
features. 

The available data is split into two disjoint datasets. The first one, referred
to as the training set, is used to build the hypothesis of possible discretizations.
The second one, the validation set, is used to evaluate the hypothesis¹.

For each continuous attribute, the boundary cut-points are collected from the 
training set. A cut point is defined as the midpoint between each successive pair 
of values in the sorted sequence of attribute values. A boundary cut-point is a cut 
point involving examples of different classes. As proved by Fayyad and Irani [5] 
the minimum of any entropy based measure must occur at a boundary cut-point. 
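As an illustration of the definition above, the following sketch collects the boundary cut-points of one attribute. The function name is hypothetical, and the treatment of repeated values (a cut next to a value that carries several classes is kept as a boundary) is an assumption, since the text does not say how ties are handled.

```python
from typing import List, Sequence


def boundary_cut_points(values: Sequence[float], labels: Sequence[str]) -> List[float]:
    """Midpoints between successive distinct values whose examples differ in class."""
    # Group the class labels seen at each distinct attribute value.
    classes_at = {}
    for v, y in zip(values, labels):
        classes_at.setdefault(v, set()).add(y)
    distinct = sorted(classes_at)
    cuts = []
    for left, right in zip(distinct, distinct[1:]):
        a, b = classes_at[left], classes_at[right]
        # The cut is a boundary unless both sides hold one and the same class.
        if len(a) > 1 or len(b) > 1 or a != b:
            cuts.append((left + right) / 2.0)
    return cuts
```

Only one sort per attribute is needed here, which is the property the Discussion section relies on.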

Our approach performs an A* search over the set of hierarchies defined by 
the boundary cut-points of each attribute. The basic goal of the search is to 
determine the number of intervals into which each attribute will be divided. 



3.1 The Search Space 

The search space is defined by all the possible combinations of attribute dis- 
cretizations. 

It consists of $n_1 \times n_2 \times \dots \times n_n$ states, where $n_i$ is the number of boundary
cut points of attribute $i$ and $n$ is the number of continuous features.

A state on this search space can be described by a vector of integers, $[v_1, \dots, v_n]$,
where $v_i$ is the number of intervals used to discretize attribute $i$. If $v_i = 1$ this
means that attribute $i$ does not contribute to class discrimination. Attribute $i$ is
irrelevant and will be ignored.

¹ In the experimental study, we use 30% of the data for the validation set.

The most general discretization corresponds to the vector $[1, \dots, 1]$ and the
most specific corresponds to the vector $[n_1, \dots, n_n]$. The search space can be seen
as a lattice:



[Figure: lattice running from the Initial state to the Goal State, with each state s scored by f(s)]

Fig. 1. The search space lattice



The initial state of the search engine consists of the vector [1, ..., 1],
meaning that no continuous attribute contributes to class discrimination. Each state
of the search lattice defines a discretized dataset. We evaluate each state of
discretization by calling the target learning system with the corresponding
discretized dataset. The learned theory is evaluated on the validation set, and the
obtained score is taken as a quality assessment of the tried discretization.

The search through the space of discretizations is done with a specialization 
operator. On the search tree, each node has n descendants. On each descendant, 
the number of intervals of one attribute grows by a user defined parameter. 

Consider, for example, a problem with 3 continuous attributes. Each state is 
given by a 3 dimensional vector, for example, [4, 2, 6] meaning that the first con- 
tinuous attribute was divided into 4 intervals, the second attribute was divided 
into 2 intervals and the third attribute was divided into 6 intervals. Assuming 
that the interval increment is set to 2, this node will have three descendants: 

[6, 2, 6], [4, 4, 6] and [4, 2, 8]. 

The next node to expand is the one not yet expanded and with the lowest value of
an objective function which will be described below.
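The specialization operator of the example can be sketched as below. The names are hypothetical, and capping each component at a per-attribute maximum is an assumption added so that the sketch never leaves the lattice.

```python
from typing import List, Tuple

State = Tuple[int, ...]


def specialize(state: State, step: int, max_intervals: State) -> List[State]:
    """Descendants of a state: one attribute at a time gains `step` intervals."""
    children = []
    for i, v in enumerate(state):
        grown = min(v + step, max_intervals[i])  # assumed cap at the lattice top
        if grown != v:
            children.append(state[:i] + (grown,) + state[i + 1:])
    return children
```

Calling `specialize((4, 2, 6), 2, (10, 10, 10))` yields the three descendants listed in the example.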
3.2 States Evaluation

The evaluation of a candidate set of intervals is done as follows. The training and
the validation data are discretized using the candidate set. This means that for
each example the value of a continuous attribute is replaced by the representative
value of the interval where the attribute value falls.

To discretize a dataset, we need to know not only the number of intervals
but also their limits. We apply a k-means procedure to the boundary cut-points
vector of each continuous attribute. For attribute $i$, the k parameter, i.e. the
number of intervals, is given by the $v_i$ of the vector of the current hypothesis.
The outputs of the k-means procedure are the boundaries and the representative
values of each interval. Presently we use the mean as the representative value for
each interval.
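A minimal sketch of such a one-dimensional k-means step is given below. It is not the authors' implementation: the equal-width initialization follows the description in Section 2.1, while the function name, the iteration cap and the rule that an empty group keeps its previous centre are assumptions.

```python
from typing import List, Tuple


def kmeans_1d(points: List[float], k: int, max_iter: int = 100) -> Tuple[List[float], List[float]]:
    """Return (boundaries, representatives) for k intervals over 1-D points."""
    pts = sorted(points)
    lo, hi = pts[0], pts[-1]
    # Equal-width start: centres in the middle of k equally wide bins.
    centres = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda j: abs(p - centres[j]))
            groups[nearest].append(p)
        # Each centre moves to the mean of its group; empty groups stay put.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
        if updated == centres:
            break
        centres = updated
    centres.sort()
    # Interval limits are taken halfway between neighbouring representatives.
    boundaries = [(a + b) / 2.0 for a, b in zip(centres, centres[1:])]
    return boundaries, centres
```

For the points 1, 2, 3, 10, 11, 12 and k = 2 the sketch returns one boundary at 6.5 and the representatives 2.0 and 11.0.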

The objective function we minimize is a combination of two functions: f and g.
The first one, function f, measures the distance to the goal. It is computed as
the error rate on the validation dataset, discretized using the current hypothesis.
Obviously the goal is an error rate near zero.

The second function measures the distance from the initial state. It is
computed as g(v) = [formula missing in the source], where k is a user-defined
parameter to avoid the dominance of g in the objective function. The intuition
behind the use of function g is that between two different states that evaluate to
the same value for function f we prefer the one which minimizes the number of
intervals. The set of intervals is computed from the values observed on a learning
set. If there is a large number of intervals, small variations on the training set
will be propagated on to the set of intervals. By minimizing the number of
intervals, we are also reducing the dependence of the set of intervals on the
training set. This fact will have a positive influence on the variance of the
generated classifier.



3.3 Terminal Conditions 

There are two terminal conditions. The first one occurs if the evaluation of
function f returns 0.

The second condition occurs when, for several iterations of the state's
expansion, there is no improvement of the objective function. Presently we use the
following heuristic: after 3 expansions without improving the objective function,
the system linearly increases the step. After 3 new expansions without
improvement, the discretization process stops.
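The search loop and its two terminal conditions can be sketched as a best-first search over the lattice. Several parts are assumptions and not the authors' code: `error_rate` is a hypothetical callback standing in for "discretize, train the target learner, measure validation error"; g is taken to be the number of intervals added since the initial state divided by `k_penalty`, because the exact formula is not recoverable from the text; the step is widened by one, and only once.

```python
import heapq
from typing import Callable, Dict, Tuple

State = Tuple[int, ...]


def search_discretization(
    max_intervals: State,
    error_rate: Callable[[State], float],
    k_penalty: float = 100.0,
    step: int = 2,
    patience: int = 3,
) -> State:
    """Best-first search for the state with the lowest f + g."""
    cache: Dict[State, float] = {}

    def f(state: State) -> float:
        # Validation error of the learner on the data discretized by `state`.
        if state not in cache:
            cache[state] = error_rate(state)
        return cache[state]

    def objective(state: State) -> float:
        g = sum(v - 1 for v in state) / k_penalty  # assumed form of g
        return f(state) + g

    start = (1,) * len(max_intervals)
    best, best_score = start, objective(start)
    frontier = [(best_score, start)]
    seen = {start}
    stale, widened = 0, False
    while frontier:
        _, state = heapq.heappop(frontier)
        if f(state) == 0.0:  # first terminal condition
            return state
        improved = False
        for i, v in enumerate(state):
            child = state[:i] + (min(v + step, max_intervals[i]),) + state[i + 1:]
            if child in seen:
                continue
            seen.add(child)
            score = objective(child)
            heapq.heappush(frontier, (score, child))
            if score < best_score:
                best, best_score, improved = child, score, True
        stale = 0 if improved else stale + 1
        if stale >= patience:  # second terminal condition
            if widened:
                break
            step, stale, widened = step + 1, 0, True
    return best
```

With a toy error function that only depends on the first attribute, the search returns a state that leaves the second attribute at one interval, which is how irrelevant attributes end up ignored.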

3.4 Discussion 

The proposed method requires one sort operation for each continuous attribute. 
A normal Decision Tree requires one sort operation for each node and each 
continuous attribute. 

At each evaluation of the search procedure, we apply a k-means procedure
that has no proof of convergence, although it appears to always converge in a
finite number of iterations [8]. It can also be computationally more expensive
than the sort operation. For large datasets the described procedure requires
much more time than generating a single Decision Tree.

166 Joao Gama et al.

Another problem occurs when all or almost all the continuous attributes are 
relevant for class discrimination. In this case the starting point is far from any 
solution. Solutions for those two problems are presented in the next section. 



3.5 Extensions for Large Datasets 

For large datasets, we can consider that we have enough data points. The k-means
procedure is heavy artillery and unnecessary. Instead of using the standard iterative
procedure we apply one single convergence step, which reduces the complexity to
O(n) in the worst case.

The lattice in Figure 1 suggests a search strategy that could solve the
second problem. We can consider, on one hand, an ascendent branch that goes
from the most general to the most specific, and on the other hand a descendent
branch that goes from the most specific to the most general. The initial state
for the ascendent branch consists of the vector [1, ..., 1]. The initial state
for the descendent branch consists of the vector $[n_1, \dots, n_n]$. The ascendent search
works the same way as the descendent one, only in the opposite direction. In
the example of Section 3.1, the state [4, 2, 6] will have three descendants:
[6, 2, 6], [4, 4, 6] and [4, 2, 8] if it is on the ascendent branch, or [2, 2, 6], [4, 1, 6] and
[4, 2, 4] if it is on the descendent branch.
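The two branches differ only in the direction of the move, as the following sketch shows. The function name is hypothetical; the floor of one interval on the descendent branch is inferred from the descendant [4, 1, 6] in the example, and the cap at `max_intervals` is an assumption.

```python
from typing import List, Tuple

State = Tuple[int, ...]


def descendants(state: State, step: int, max_intervals: State, ascending: bool) -> List[State]:
    """Neighbours of a state on the ascendent or on the descendent branch."""
    out = []
    for i, v in enumerate(state):
        if ascending:
            moved = min(v + step, max_intervals[i])  # towards the most specific state
        else:
            moved = max(v - step, 1)  # towards the most general state
        if moved != v:
            out.append(state[:i] + (moved,) + state[i + 1:])
    return out
```

Both calls on the state (4, 2, 6) with a step of 2 reproduce the descendants listed in the text.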

The search engine can skip from the ascendent branch to the descendent 
branch, because the next node to expand is the one not yet expanded and with 
lowest value of the objective function. 



4 Empirical Evaluation 

The method was evaluated on 5 well known datasets from the UCI repository. 
Table 1 shows some basic dataset characteristics. In the last column we present 
the sum of boundary cut-points for all continuous attributes in the dataset, 
providing a hint on the size of the search space. The evaluation was done using 
10 fold cross validation. For each fold, the dataset is split into a training set and 
an independent test set. 



Dataset      Nr. Examples  Classes  Attributes  Continuous  Nr. Boundary cut-points
Australian   690           2        14          6           862
Diabetes     768           2        8           8           1048
German       1000          2        24          7           209
Heart        270           2        13          7           317
Iris         150           3        4           4           111

Table 1. Datasets Characteristics

The training set is used by our algorithm, Discretizer, to generate the set of
intervals as explained before. Internally, this dataset is divided into a training and a
validation set. Discretizer outputs for each continuous attribute a set of interval
boundaries and the representative value of each interval. This output is used to
discretize both the training set and the test set.




Fig. 2. A simplification of the search space for the Iris problem 

Tables 2 and 3 respectively present the results of applying C4.5 and naive 
Bayes to both the original and discretized datasets. 

For each dataset we show both error rates and the significance of the difference
of the means from paired t-tests, under the null hypothesis that the means of the
errors are equal. The last column shows the mean number of used attributes
in the discretized version of the dataset.



Dataset      C4.5        Discretizer  ttest   Used Attributes (Mean)
Australian   14.9 ±2.6   12.5 ±3.2    -97%    2.5
Diabetes     25.9 ±3.3   25.3 ±3.8    -40%    4.5
German       28.9 ±4.4   27.2 ±3.1    -91%    3.5
Heart        22.2 ±6.1   25.6 ±8.1    79%     3.6
Iris         4.0 ±4.7    5.3 ±4.2     65%     3.0
Means        19.2        19.0

Table 2. Error rates using C4.5 from 10-fold Cross Validation



Dataset      Bayes       Discretizer  ttest   Used Attributes (Mean)
Australian   22.5 ±4.9   14.2 ±3.4    -100%   2.4
Diabetes     26.3 ±6.4   25.8 ±6.3    25%     5.5
German       25.8 ±4.4   26.3 ±3.6    99%     3.3
Heart        15.9 ±6.1   16.7 ±8.1    27%     4.0
Iris         4.0 ±5.0    6.7 ±4.2     56%     3.6
Means        18.9        18.7

Table 3. Error rates using Bayes from 10-fold Cross Validation

With respect to error rates, the trend indicated by these results is that
learning on the discretized dataset competes well against learning on the original
dataset. Three times we observed a statistically significant improvement: on the
Australian and German datasets using C4.5, and on the Australian data using Bayes.
Only once did we observe a significant degradation of the error rate: the German
data using Bayes. Most important is the effect on feature
reduction. In almost all the experiments, redundant features are determined and
eliminated from the discretized dataset. In some folds of the cross validation
procedure, all the continuous features are removed. Most of the time, only half
of the original continuous features are considered to be relevant.

Figure 2 visualizes part of the search space for a simplified version of the Iris
dataset. We only consider two attributes: petal and sepal width. For different
numbers of discretization intervals we plot the error rate obtained using C4.5.
The minimum error rate is obtained using 5 intervals for petal width and is
independent of the attribute sepal width.



5 Conclusions 

According to Dougherty's terminology, the method we propose is global,
supervised and dynamic. Global means that it is used as a pre-processing method:
discretization is performed prior to the learning phase. The method is
supervised because the possible intervals are defined in terms of boundary cut
points. And it is dynamic because all attributes are discretized in an
interdependent way.

In the empirical evaluation carried out, we only observed once a significant 
error rate degradation, while a significant increase of performance occurred three 
times. 

The most significant aspect of the proposed method is related to its ability
to perform feature selection while discretizing continuous features. For instance,
on the Australian dataset, we observed a significant increase of performance using
only 2 or 3 of the 6 original continuous attributes. The dynamic discretization,
which looks for interdependencies between features, detects redundant attributes
and performs feature selection as a side effect. The proposed method seems to
be effective at detecting irrelevant features.

The formulation of the search procedure that we have described extends naturally
to parallel computation. Instead of setting one or two starting points,
we can have several points distributed over the search space, and perform a
parallel search. The search engine is not restricted to A* algorithms. We intend
to use genetic algorithms to perform the search.

Abstract. The next generation of intelligent information systems will rely on
cooperative agents for playing a fundamental role in actively searching and
finding relevant information on behalf of their users in complex and open
environments, such as the Internet. On the other hand, the relevance of such
information is a user-dependent notion within the scope or context of a particular
domain or topic. Previous work, mainly in information retrieval (IR), focuses on
the analysis of the content by means of keyword-based metrics. Some recent
algorithms apply social or collaborative information filtering to improve the
task of retrieving relevant information and for refining each agent's particular
knowledge. In this paper, we combine both approaches, developing a new
content-based filtering technique for learning up-to-date users' profiles that
serves as the basis for a novel collaborative information-filtering algorithm. We
demonstrate our approach through a system called RAAP (Research Assistant
Agent Project) devoted to supporting collaborative research by classifying
domain-specific information, retrieved from the Web, and recommending these
"bookmarks" to other researchers with similar interests.



1 Introduction 

Undoubtedly, in the next generation of intelligent information systems, cooperative 
information agents will play a fundamental role in actively searching and finding 
relevant information on behalf of their users in complex and open environments, such 
as the Internet. Relevance, on the other hand, can only be defined for a specific user, 
and in the context of a particular domain or topic. Because of this, the
development of intelligent, personalized, content-based document classification
systems is becoming more and more attractive. Furthermore, learning profiles
that represent the user's interests within a particular domain, later used for
content-based filtering, has been shown to be a challenging task. This becomes more
difficult if the relevant set of attributes for each class changes over time, which
makes the problem unsuitable for traditional, fixed-attribute machine learning algorithms.

Documents, as well as users' profiles, are commonly represented as keyword
vectors¹ in order to be compared or learned. With the huge variety of words used in
natural language, we find ourselves with a noisy space that has extremely high
dimensionality (10^[?] - 10^[?] features). For a particular user, it is reasonable to think that
processing a set of correctly classified relevant and irrelevant documents from a
certain domain of interest may lead to identifying and isolating the set of relevant
keywords for that domain. Later on, these keywords or features can be used for
distinguishing documents belonging to that category from others. Thus, these
user-domain specific sets of relevant features, which we will call prototypes, may be used to
learn to classify documents. It is worth noting that these prototypes may
change over time, as the user develops a particular view for each class. The problem
of personalized learning of text classification is, in fact, similar to the one of on-line
learning from examples when the number of relevant attributes is much less than the
total number of attributes and the concept function changes over time, as described in
[1].

From a different perspective, cooperative multi-agent systems implicitly share
"social" information, which can potentially be used to improve the task of retrieving
relevant information, as well as refining each agent's particular knowledge. Using this
fact, a number of "word-of-mouth" collaborative information filtering² systems have
been implemented to recommend things of interest to the user. This is done based
on the ratings that other correlated users have assigned to the same object. Usually this
idea has been developed for specific domains, like "Music" or "Films" as in Firefly³
and Movie Critic⁴, or for introducing people (matchmaking) as in Yenta [2]. A major
drawback of these systems is that some of them completely ignore any information that
can be extracted from the content. This can be somewhat convenient for domains that
are hard to analyze in terms of content (such as entertainment), but is definitely not
suitable for textual content-driven environments such as the World-Wide Web
(WWW). Besides, these systems usually demand direct intervention from the user
for classifying and/or rating information.

In this paper we describe a multi-agent system called RAAP (Research Assistant
Agent Project) that intends to bring together the best of both worlds: Content-based
and Collaborative Information Filtering. In RAAP, personal agents help their users
(researchers) to classify domain-specific information found on the Web, and
recommend these URLs to other users with similar interests. This eventually brings
benefits for all the users within an organization, as information re-discovery is
avoided and only peer-reviewed documents are recommended among them. This is
particularly useful while involved in information- and knowledge-intensive
collaborative work, such as scientific research.



¹ This is called the Vector Model and has been widely used in Information Retrieval (IR) and
AI.
² Also called Social Filtering [3].
³ http://www.firefly.net
⁴ http://www.moviecritic.com

2 Description of RAAP

The Research Assistant Agent Project (RAAP) is a system developed with the purpose of
assisting the user in the task of classifying documents (bookmarks), retrieved from the
World Wide Web, and automatically recommending them to other users of the system
with similar interests. In order to evaluate the system, we narrowed our objectives to
supporting a specific type of activity, such as Research and Development (R&D) for a
given domain. In our experiment, tests were conducted
using RAAP for supporting research in the Computer Science domain.

RAAP consists of a bookmark database with a particular view for each user, as well
as a software agent that monitors the user's actions. Once the user has registered an
"interesting" page, his or her agent suggests a classification among some predefined
categories, based on the document's content and the user's profiles (Fig. 1). Then the
user has the opportunity to reconfirm the suggestion or to change the classification into
the one that he or she considers best for the given document, as shown in Fig. 2.




Figure 1 Agent’s Suggestion 




Figure 2 Classification 



Figure 3 Recommendation 



In parallel, the agent checks for newly classified bookmarks, and recommends these
to other users, who can either accept or correct them when they eventually log in to the
system and are notified of such recommendations, as illustrated in Fig. 3.

During the first-time registration into the system, the user is asked to select
his or her research areas of interest. This information is used to build the initial profile
of that user for each class. The agent updates the user's profile for a specific class
every time a certain number k of documents are successfully classified into it. In that
way, RAAP only uses up-to-date profiles for classification, always reflecting the latest
interests of the user. During this process the agent learns how to improve its
classification and narrow the scope of the people to whom it recommends, in a way we
shall explain in the following sections.



3 Learning to Classify 

In our system, a document belonging to a user is represented as a finite list of terms,
resulting from filtering out a stoplist of common English words from the original
document fetched from the WWW using its URL. These documents are either
explicitly "bookmarked" or "accepted" by the user. A "rejected" document can
sometimes also be "saved" by the user's agent as a negative example. Under this
notion, and from now on, we will be using the words "document", "page" and
"bookmark" without distinction. We will also be using "class", "category" and
"topic" in the same way.

Because we are trying to build a document classification system that learns, it is in
our interest to keep track of the positive and negative examples among the documents
that have already been classified. In RAAP, positive examples for a specific user U_i
and for a class C_j are the documents explicitly registered or accepted by U_i and
classified into C_j. Note that accepting a recommended bookmark is just a special case
of registering it. On the other hand, negative examples are either deleted bookmarks,
misclassified bookmarks or rejected bookmarks that happen to be classified into C_j.
In the case of rejected bookmarks the document is only classified by the agent: we
don't ask the user to reconfirm the classification for rejected bookmarks. In this sense,
and as a measure of precaution, we only take rejected bookmarks that fall into a class in
which the user has at least one bookmark correctly classified (a class in which the
user has shown some interest).

• Let $C^{+}_{i,j}$ be the set of all documents classified as positive examples for user U_i
and class C_j

• Let $C^{-}_{i,j}$ be the set of all documents classified as negative examples for user U_i
and class C_j

Now for a given user, we want to build a classifier Q for each category, as a list of
terms, in order to apply the dot product similarity measure (Eq. 1), widely used in IR
for the purpose of ranking a document D with respect to a given query. The classifier most
similar to the document would indicate the candidate class.

(1) $sim(Q, D) = \sum_{\tau \in Q} w(\tau, Q) \, w(\tau, D)$

For term weighting, we chose TF-IDF (Eq. 1.1), as it is one of the most successful
and well-tested term weighting schemes in IR. It consists of the product of the
term frequency (TF) of term $\tau$ in the document $d$ by $idf_\tau = \log_2(N / df_\tau) + 1$, the inverse
document frequency (IDF), where $N$ is the total number of documents of the collection
and $df_\tau$ is the total number of occurrences of $\tau$ in the collection. Note that we
maintain a different collection for each user.
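The weighting scheme and the similarity of Eq. 1 can be illustrated with the short sketch below. The function names are hypothetical. One assumption is made explicit: $df_\tau$ is counted here as the number of documents of the collection that contain the term, which is the usual reading of a document frequency.

```python
import math
from collections import Counter
from typing import Dict, List


def idf(term: str, collection: List[List[str]]) -> float:
    """idf = log2(N / df) + 1, with df assumed to be a document count."""
    n_docs = len(collection)
    df = sum(1 for doc in collection if term in doc)
    return math.log2(n_docs / df) + 1 if df else 0.0


def weights(doc: List[str], collection: List[List[str]]) -> Dict[str, float]:
    """TF-IDF weight of every term of doc: term frequency times idf."""
    tf = Counter(doc)
    return {term: tf[term] * idf(term, collection) for term in tf}


def sim(query_w: Dict[str, float], doc_w: Dict[str, float]) -> float:
    """Dot product over the terms of the query (Eq. 1)."""
    return sum(w * doc_w.get(term, 0.0) for term, w in query_w.items())
```

Because each user has his or her own collection, the same page can receive different weights for different users.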

3.1 Relevance Feedback

Given the unique terms in Q, $P = \{\tau_1, \tau_2, \tau_3, \dots, \tau_{|P|}\}$, named P for Prototype, it is
easy to see that both Q and D can be represented as numerical vectors containing the
respective weights of these terms in Q and D, denoted as $\vec{Q}$ and $\vec{D}$ respectively.
Their dot product⁵ is, in fact, the similarity measure explained before. We can also
express Q and D as numerical vectors containing the term frequency of each
element of P within them, denoted respectively as:

$Tf(D) = \langle tf(\tau_1, D), tf(\tau_2, D), \dots, tf(\tau_{|P|}, D) \rangle$

$Tf(Q) = \langle tf(\tau_1, Q), tf(\tau_2, Q), \dots, tf(\tau_{|P|}, Q) \rangle$



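We can now describe mathematically a very simple approach to the process of
relevance feedback as:

(2) $Tf(Q^{t+1}) = Tf(Q^{t}) + \alpha \, Tf(D)$

where $\alpha = 1$ if $D \in C^{+}_{i,j}$
and $\alpha = -1$ if $D \in C^{-}_{i,j}$.

And then, recalculate $\vec{Q}^{t+1}$ based on the values of $Tf(Q^{t+1})$.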



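Another approach found in the literature is Rocchio's algorithm [3]:

(2.1) $\vec{Q} = \frac{\beta}{|C^{+}_{i,j}|} \sum_{D \in C^{+}_{i,j}} \vec{D} - \frac{\gamma}{|C^{-}_{i,j}|} \sum_{D \in C^{-}_{i,j}} \vec{D}$

Note that $C^{-}_{i,j} = \overline{C^{+}_{i,j}}$.

Usual values are $\beta = 4$ and $\gamma = 4\beta$.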



The basic problem of these algorithms, and the main reason why we couldn't use
them for our system, is that they do not take into account the possibility that the
dimension of the vectors $\vec{Q}$ and $\vec{D}$ may change over time. In other words, unique terms
listed in P can be added or deleted in order to reflect the user's current interests
(feature selection).

Perhaps Rocchio's algorithm can be adapted to be recalculated each time a unique
term is added to or deleted from P, but the computational cost would be very high. This is
without mentioning the size of the complement of the positive examples, used to
calculate the negative part of the formula.



⁵ The dot product between the numerical vectors of Q and D is denoted as $\langle \vec{Q} \circ \vec{D} \rangle$.

Instead, we propose a new algorithm for incrementally building $\vec{Q}$ in order to reflect
effectively the user's current interests. We shall first give a couple of useful
definitions:

Definition 1: We define the positive prototype for a class C_j, user U_i at time t,

$P^{(t)+}_{i,j}$,

as a finite set of unique indexing terms, chosen to be relevant for C_j, up to time t
(see feature selection).

Definition 2: We define the negative prototype for a class C_j, user U_i at time t,

$P^{(t)-}_{i,j} = \{ \tau \in P^{(t)+}_{i,j} \mid tf(\tau, C^{-}_{i,j}) > 0 \}$,

as a subset of the corresponding positive prototype, each of whose
elements can be found at least once in the set of documents classified as negative
examples for class C_j.

Now we construct the vector of our positive prototype as follows:

At time t + 1 {
    if ( $P^{(t+1)+}_{i,j} = P^{(t)+}_{i,j}$ ) {
        $Tf(Q^{(t+1)+}_{i,j}) = Tf(Q^{(t)+}_{i,j}) + Tf(D^{(t)}_{i,j})$
        Update $\vec{Q}^{(t+1)+}_{i,j}$ based on $Tf(Q^{(t+1)+}_{i,j})$
    } else {
        for all $\tau \in P^{(t+1)+}_{i,j}$ do {
            calculate $w(\tau, d)$ for the $n$ most recently processed
            documents $d \in C^{+}_{i,j}$ and update these values in $\vec{Q}^{(t+1)+}_{i,j}$
        }
    }
}

Where n is the number of documents used as the basis for feature selection.



This algorithm is to be applied in the same way for the negative prototype. We can 
now re-define the similarity measure between a class Cj and a document D as: 



( 3 ) 






)( 0 - 

'J 




Expressed in words, for a given user $u_i$, the similarity between a class $c_j$ and an
incoming document D at time t is equal to the similarity of D with respect to the
classifier of the corresponding positive prototype minus the similarity of D with
respect to the classifier of the corresponding negative prototype. Intuitively, a
document is similar to a class if it is similar to the class's positive prototype and
not similar to the class's negative prototype. It is important to note that the initial
positive prototype for each class is a list of selected core keywords from that domain
that were integrated into the system to provide at least an initial classification.
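
The similarity measure described above (positive-prototype similarity minus negative-prototype similarity) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: classifiers and documents are taken to be sparse term-weight vectors stored as dictionaries, and the names `dot`, `class_similarity`, `best_class` and the toy weights are hypothetical, not part of RAAP.

```python
def dot(q, d):
    """Dot product of two sparse term-weight vectors stored as dicts."""
    return sum(w * d.get(term, 0.0) for term, w in q.items())


def class_similarity(q_pos, q_neg, doc):
    """Similarity to the positive classifier minus similarity to the negative one."""
    return dot(q_pos, doc) - dot(q_neg, doc)


def best_class(doc, classifiers):
    """Return the class whose (q_pos, q_neg) pair scores highest for doc."""
    return max(classifiers, key=lambda c: class_similarity(*classifiers[c], doc))


# Toy data (hypothetical weights, chosen to be exact in binary floating point).
classifiers = {
    "agents": ({"agent": 0.5, "filter": 0.25}, {"agent": 0.25}),
    "logic": ({"proof": 0.5}, {}),
}
doc = {"agent": 1.0, "filter": 1.0}
score = class_similarity(*classifiers["agents"], doc)
winner = best_class(doc, classifiers)
```

A term that appears in both prototypes contributes the difference of its two weights, so evidence from negative examples lowers the score without removing the term from the positive prototype.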

Finally, we use a heuristic for RAAP's ranking. This heuristic states that a new
document is more likely to be classified into a class in which the user has shown
some interest before. We choose the class with the highest ranking among these.

3.2 Feature Selection 

Automatic feature selection methods include the removal of non-informative terms
according to corpus statistics. In comparative studies on feature selection for text
categorization [4], information gain (IG) is shown to be one of the most effective
criteria for this task. Information gain is frequently employed as a term-goodness
criterion in the field of machine learning [5].

Given these properties, we decided to use IG for selecting informative terms from
the corpus. We redefine the expected information gain that the presence or absence
of a term $\tau$ gives toward the classification of a set of pages (S), given a class
$c_j$ and a particular user $u_i$, as:

    $E_{i,j}(\tau, S) = I(S) - [P(\tau = present) I(S_{\tau = present}) + P(\tau = absent) I(S_{\tau = absent})]$    (4)

where

    $I(S) = \sum_{c \in \{c_j^{+}, c_j^{-}\}} -P(S_c) \log_2 P(S_c)$

In Eq. 4, $P(\tau = present)$ is the probability that $\tau$ is present on a page,
$S_{\tau = present}$ is the set of pages that contain at least one occurrence of $\tau$,
and $S_c$ are the pages belonging to class c.

Using this approach, in RAAP, the user’s agent finds the k most informative words
from the set S of the n most recently classified documents. As in Syskill & Webert [6],
we chose k = 128 and arbitrarily selected n = 3 for our experiments.
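
As an illustration of Eq. 4, the following Python sketch computes the expected information gain of a term over a small set of labeled pages and ranks the vocabulary by it. It assumes that each page is represented as a set of terms together with a boolean label (inside or outside the class); the function names and the toy pages are hypothetical.

```python
import math


def entropy(labels):
    """I(S): entropy of the class labels of a set of pages."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h


def information_gain(term, pages):
    """Expected information gain of the presence/absence of term.

    pages is a non-empty list of (set_of_terms, label) pairs.
    """
    n = len(pages)
    labels = [label for _, label in pages]
    present = [label for terms, label in pages if term in terms]
    absent = [label for terms, label in pages if term not in terms]
    remainder = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - remainder


def top_k_terms(pages, k):
    """The k most informative terms; ties are broken alphabetically."""
    vocabulary = set().union(*(terms for terms, _ in pages))
    return sorted(vocabulary, key=lambda t: (-information_gain(t, pages), t))[:k]


pages = [
    ({"agent", "web"}, True),
    ({"agent"}, True),
    ({"logic"}, False),
    ({"web", "logic"}, False),
]
best = top_k_terms(pages, 2)
```

In this toy set, "agent" and "logic" each split the pages perfectly (gain 1.0), while "web" occurs equally often in both classes (gain 0.0).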

Out of the selected 128 terms, 28 are fixed, as they constitute the core list of
keywords, or a basic ontology for a topic, given for that class as an initial classifier.
For the remaining 100 words, we adopt the following scheme for adding or replacing
them in the positive prototype:

1. Perform stemming over the most informative words in order to create the list of
terms.

2. Replace only the terms that are in the prototype but not in the list of the most
informative terms.

3. As shown in the algorithm for constructing the classifier, update the weights
of the newly added or replaced terms with respect to the n documents
processed by the agent for this purpose.
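
The replacement scheme in step 2 can be sketched as follows. This is one possible reading, not RAAP's actual code: it assumes terms are already stemmed (step 1), that `informative` is ordered from most to least informative, and that core keywords are stored separately and never replaced. The function name `update_prototype` and its parameters are hypothetical.

```python
def update_prototype(core, current, informative, size=100):
    """Return the new non-core part of the positive prototype.

    core        -- fixed core keywords (never replaced)
    current     -- non-core terms currently in the prototype
    informative -- stemmed terms ordered by decreasing information gain
    size        -- maximum number of non-core terms (100 in the paper)
    """
    informative_set = set(informative)
    # Terms that are still informative keep their place (and their weights).
    kept = [t for t in current if t in informative_set]
    # Freed slots are filled with the most informative terms not yet present.
    candidates = [t for t in informative if t not in core and t not in kept]
    return kept + candidates[: max(0, size - len(kept))]


new_terms = update_prototype(
    core={"agent"},
    current=["web", "java", "news"],
    informative=["web", "filter", "news", "agent"],
    size=3,
)
```

Only "java", which is no longer informative, is replaced; "web" and "news" are untouched, which matches the idea that terms recurring over time keep their accumulated weights.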

We conclude this section by noting that even though IG is a computationally
expensive process, in RAAP its cost is drastically reduced both by keeping n low and
by updating the weights for the selected terms only with respect to these documents.
We also provide a mechanism in which the “memories” of the terms that repeat over
time are left intact, given that their accumulated weight and information value are
high.

4 Learning to Recommend



To learn to whom to recommend a page saved by the user, the agent relies on two
matrices: the user vs. category matrix $M_{m \times n}$ and the user’s confidence
factor, where m is the number of users in the system and n the number of categories.
The first one is automatically constructed by counting, for each user, the number of
times a document is successfully classified into a certain class. During the initial
registration in RAAP, the matrix is initialized to one for the classes in which the user
has shown interest.

The first idea was to use the user-category matrix to calculate the correlation
between a user $u_x$ and the rest, using the Pearson-r algorithm (Eq. 5), and then to
recommend the newly classified bookmarks to the users with the highest correlation.





                         John   Kato   Ishii   Joacc   Rita
Information Retrieval       7      7       3       4      7
Temporal Reasoning          4      3       2       0      4
CBR                         1      3       7       2      3
Distributed AI              2      5       3       7      2


Table 1 User-Category Matrix 



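    $correl(u_x, u_y) = \frac{\sum_i (u_{x,i} - \bar{u}_x)(u_{y,i} - \bar{u}_y)}{\sqrt{\sum_i (u_{x,i} - \bar{u}_x)^2 \cdot \sum_i (u_{y,i} - \bar{u}_y)^2}}$    (5)

where $\bar{u}_j = \frac{1}{n} \sum_i u_{j,i}$, $j \in \{x, y\}$.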

One problem with this approach is that the correlation is only calculated as an
average criterion of likeness between two users, regardless of the relations between
the topics. That is, if the agent decides to recommend a bookmark classified into a
class x, but its user happens to be highly correlated to another user based on the
values in the matrix with respect to other classes, then the bookmark would be
recommended to that user anyway. These classes may be unrelated to the class of the
bookmark, which is undesirable since we only want the agents to recommend
bookmarks to people who are interested in the topics to which the bookmark belongs
or in related ones. What we really want is to give more weight in the correlation
between users to those classes more related to the class of the bookmark that is going
to be recommended. For this reason we introduce the concept of similarity between
two classes for a user $u_i$ at time t, as the Dice coefficient between the positive
prototypes of the classes.

    $rel_i^{(t)}(c_m, c_n) = \frac{2 |A \cap B|}{|A| + |B|}$    (6)

where A and B are the positive prototypes of $c_m$ and $c_n$, $|A \cap B|$ is the
number of common terms, and $|A|$ is the number of terms in A.

Given the class of the bookmark $c_j$, the class similarity vector is defined as:

    $R_j = \langle rel(c_j, c_1), rel(c_j, c_2), \ldots, rel(c_j, c_n) \rangle$    (6.1)

where n = # of classes.

Now we multiply this vector by the user-category matrix, obtaining a weighted
user-category matrix.

    $WM = R_j \times M$    (7)
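
A small Python sketch of the class-similarity weighting follows. It assumes that the product in Eq. 7 scales each category's counts by that category's Dice similarity to the bookmark's class, which is one plausible reading of the text; the prototypes, counts and function names are hypothetical.

```python
def dice(a, b):
    """Dice coefficient between two sets of terms."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def similarity_vector(target, prototypes):
    """Similarity of the bookmark's class to every class (Eq. 6.1)."""
    return {c: dice(prototypes[target], p) for c, p in prototypes.items()}


def weighted_matrix(matrix, rel):
    """Scale every user's per-class count by the relatedness of that class."""
    return {user: {c: rel[c] * count for c, count in row.items()}
            for user, row in matrix.items()}


prototypes = {
    "IR": {"query", "index", "rank", "term"},
    "CBR": {"case", "index", "adapt", "term"},
    "DAI": {"agent", "case"},
}
matrix = {"john": {"IR": 7, "CBR": 1, "DAI": 2}}

rel = similarity_vector("IR", prototypes)
wm = weighted_matrix(matrix, rel)
```

With these toy prototypes, a bookmark in class "IR" leaves the "IR" counts unchanged, halves the "CBR" counts and zeroes the "DAI" counts before the correlation between users is computed.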

Using this new user-category matrix, similar to the one shown in Table 1 but with
modified (weighted) values, we proceed to calculate the weight between the subject
user $u_x$ (who recommends) and the others $u_i$ (candidates to receive the
recommendation) as the correlation between them multiplied by the confidence factor
between them.

    $Weight(u_x, u_i) = correl(u_x, u_i) \cdot confidence(u_x, u_i)$    (8)

In Eq. 8, the confidence factor of user $u_x$ with respect to $u_i$ is a function with a
range between 0.1 and 1. It returns 1 for all new users with respect to others, and it is
increased or decreased by a factor of 0.01 every time a bookmark recommended by
user $u_x$ is accepted or rejected by $u_i$, respectively. Note that
$confidence(u_x, u_i) \neq confidence(u_i, u_x)$. This means that the confidence is not
bidirectional, but differs for every ordered pair of users.

To decide to whom to recommend, we used a threshold of 0.5 for the minimum
weight, as well as recommending to at most tofn)= \ l/(^-264))+5] number of users, 
where n is the total number of users in the system. We use f(n) to keep the number of
users selected as recipients for the recommendation in reasonable proportion to the
number of registered users, which at some point in time can be huge.

To avoid circular references in the recommendation chain, the agents verify that 
the recommended document is not already registered in the target’s database. 
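
The recipient selection described in this section can be sketched as follows. Several points are assumptions of this sketch rather than facts from the paper: the confidence factor is adjusted additively by 0.01 (up on acceptance, down on rejection) and clamped to [0.1, 1], and the cap on the number of recipients is passed in as a plain parameter `max_recipients` instead of being computed from f(n). All names are hypothetical.

```python
def update_confidence(conf, accepted, step=0.01, low=0.1, high=1.0):
    """Adjust the confidence factor after one accepted or rejected recommendation."""
    conf = conf + step if accepted else conf - step
    return min(high, max(low, conf))


def weight(correlation, confidence):
    """Eq. 8: correlation between two users scaled by the confidence factor."""
    return correlation * confidence


def select_recipients(weights, threshold=0.5, max_recipients=5):
    """Users whose weight reaches the threshold, best first, capped in number."""
    eligible = [(user, w) for user, w in weights.items() if w >= threshold]
    eligible.sort(key=lambda pair: (-pair[1], pair[0]))
    return [user for user, _ in eligible[:max_recipients]]


weights = {
    "kato": weight(0.9, 1.0),
    "rita": weight(0.8, 0.5),   # high correlation, but low confidence
    "ishii": weight(0.6, 1.0),
}
recipients = select_recipients(weights)
```

Here "rita" is filtered out because her weight (0.4) falls below the 0.5 threshold, even though her raw correlation is higher than that of "ishii".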



5 Experimental Results 

In order to evaluate the system we set up an experiment with 9 users in our 
laboratory. They were asked to interact freely with RAAP during one week, 
registering home pages only with content relevant to their current research interests. 
An update of the users’ prototype for a certain class was executed every 3 bookmarks 
classified into that class. 

• Bookmark Classification 

A total of 72 bookmarks were registered and classified by the agents/users into
a total of 9 different classes. We give the results of the classification in Table 2.

User ID   Correct   Incorrect   Total   Accuracy (%)   # of Updates
23              4           3       7           57.1              0
25              3           0       3          100.0              0
26              3           7      10           30.0              1
27             12           6      18           66.7              4
28              7          10      17           41.2              2
30              0           2       2            0.0              0
31              4           2       6           66.7              0
32              1           0       1          100.0              0
34              2           6       8           25.0              0


Table 2 Document Classification Accuracy 

The first thing we should notice is that results were quite different for each user.
The usage of the system also varied quite a lot. In this experiment only 2 users
(ID = 27, 28) received 2 or more updates of their profiles, and they also had the
largest number of registered bookmarks. For ID = 27, an accuracy of 66.7% was
achieved while 4 profile updates occurred. This should be compared with ID = 28,
which had a somewhat low accuracy of 41.2% with 2 updates, and ID = 26 with
30.0% and only 1 update. This growing pattern suggests that when more updates
occur, better classification is achieved. For the cases where there was no updating at
all, an average accuracy of more than 55% was obtained. But we should be very
careful with this result, since the initial, predefined selection of keywords for each
class was no guarantee of a correct classification. More learning would need to
occur in order to make a better evaluation.

• Bookmark Recommendation 

As we can see in Table 3, the overall acceptance rate was quite high for the 
majority of the users. In total, there were 74 recommendations, 52 (70%) of which 
were accepted. Comparing this with the results in bookmark classification, it is easy 
to realize that, in general, the acceptance rate was lower for those users that didn’t 
receive any update in their profile such as ID=23, 34. 



User ID   Accept   Reject   Total   Acceptance (%)
23             3        8      11             27.3
25             6        0       6            100.0
26             9        1      10             90.0
27            13        2      15             86.7
28             7        3      10             70.0
30             8        1       9             88.9
31             2        1       3             66.7
32             4        5       9             44.4
34             0        1       1              0.0


Table 3 Recommendation Acceptance Rate

Out of 72 registered bookmarks only 42 were unique, which means that 30 of them
(41.7% of all the bookmarks) actually came from recommendations. This indicates
that intelligent information sharing and collaborative filtering occurred to a high
degree.

As for the amount of data used for the experiment, we recognize that it falls
short, but it gave us a general idea of the behavior of the system.

Finally, for a more general evaluation, we must point out that for RAAP there was
no training data available, other than the bookmarks themselves. The learning was
performed on-line and incrementally throughout the interaction with the system.
Thus, RAAP cannot be directly compared with traditional off-line text categorization
algorithms. In spite of this, our classification algorithm proved to be satisfactory, to
the extent that in the majority of the cases the user didn’t even need to rectify the
agent’s suggestion. The relatively high percentage of accepted recommendations
showed not only that it is feasible to support collaborative filtering on content-based
filtering, but also that with the increase of relevant data as a product of the
recommendations, the classification accuracy is very likely to improve.



6 Related Works 

There are several related works that aim to filter and recommend “interesting” Web
pages to the user. We can classify them into those using content-based filtering
techniques and those using collaborative filtering.

Among the content-based filtering systems we can mention Syskill & Webert [6], a
system that builds the user’s profile using Expected Information Gain, and compares
the effectiveness of several machine learning algorithms for the task of classifying a
page as interesting or not for the user. A main difference with our system is that, in
Syskill & Webert, the domains of the set of web pages used for training and testing the
algorithms are decided in advance. In other words, this system only recommends to
the user pages within a specific topic, extracted from a previously chosen online
directory, or from a list of pages that result from a query to a search engine such as
LYCOS. It does not perform text categorization of a new document (at least not among
the domains), nor does it give any advice about whether the domain itself is in fact
interesting or not for the user. Another difference is that the user’s profile is built
only once and is not automatically updated afterwards. Learning is performed
off-line, and requires a training set of previously rated pages. Another similar system
is WebWatcher [7], which recommends hyperlinks within a Web page, using the
TF-IDF document similarity metric also used in our system.

Collaborative filtering systems are rarer in the literature and are currently oriented
more to commercial systems that perform recommendations in the entertainment
domain, such as Movie Critic and Firefly, as we mentioned in the introduction. Some
more classical systems are Ringo [8], a music recommending system (upon which
Firefly is based), and GroupLens [9], a system for the personalized selection of
Netnews. Both systems employ Pearson-r correlation coefficients to determine
similarity between users, regardless of the content of the information being
recommended. In any case the user is asked to rate the content, using some predefined
scale, in order to calculate this correlation.

Up to the time of the writing of this paper, there have been few reported systems
that try to combine both techniques. In the matchmaking domain, that is,
recommending people instead of recommending documents, we can mention Yenta
[2] and Referral Web [10]. These systems perform keyword-based textual analysis of
private and public documents in order to refine their recommending algorithms,
which are originally based on collaborative filtering techniques. RAAP differs from
these systems in several ways, its objective being to filter documents and to perform
on-line learning of users’ profiles. These profiles are later used not only to match
similarities among people but also among personal domains of interest.



7 Conclusion and Future Work 

The contributions of this paper are threefold: 

1) We proposed the combination of content-based information filtering with 
collaborative filtering as the basis for multi-agent collaborative information 
retrieval. For such purpose the system RAAP was explained in detail. 

2) A new algorithm for active learning of user’s profile and text categorization 
was introduced. 

3) We proposed a new algorithm for collaborative information filtering in which
not only the correlation between users but also the similarity between topics
is taken into account.

Some experimental results that support our approach were also presented. As
future work we look forward to testing RAAP in broader environments and
comparing it with other similar systems, as well as improving the efficiency of both
the classification and recommendation processes. A larger amount of data will be
collected in subsequent experiments in order to obtain a better evaluation.



Abstract. The use of action models for the analysis of control programs
can be useful for two reasons. First, it promises to deliver better tools for 
the simulation, verification and synthesis of control programs, and second 
it presents challenging problems for theories of action and knowledge. In 
this paper we use a theory of actions and knowledge developed elsewhere 
to analyze control programs for navigation tasks. We model both physical 
and sensing actions and establish conditions under which different control 
programs are executable and lead the agent to the intended goal. 



Motivation 

Consider a rectangular environment with an agent trying to reach a goal object.
The agent has a position and orientation and can either move forward or rotate.
He also has sensing capabilities that allow him to determine the position of the
target object within certain constraints: e.g., that no other object is in the way,
that the distance or angle to the goal object is within his reach, etc.



Program 1 Rotate and Move to Goal

    $reached(goal) \rightarrow done$
    $see(goal) \wedge facing(goal) \rightarrow move$
    $see(goal) \rightarrow rotate$



Intuitively if these constraints are met, a simple control loop expressed as 
the sequence of condition-action pairs in Program 1 should lead the agent to the 
goal. Methodologies for building agent control programs of this form have been 
proposed by Brooks [4], Nilsson [14] and others. Such programs are normally 
tested on simulated worlds or in the real world. Here we aim to show that such 
programs can also be tested over sufficiently rich action models. An action model 
is a description of the effects of actions on both the environment (e.g., [8]) and

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 183-194, 1998. 
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the agent’s internal state or knowledge [12,15]. Since a control program is a 
mapping from knowledge to actions, an action model can be used to predict 
whether a control program is executable and whether it will lead the agent to 
its intended goal (see also [11,2]). 

An action model is a good standpoint for the evaluation of the executability 
of a program, determining if the constraints associated with the actions are met, 
the conditions for the execution of the actions are achieved, and if such condi- 
tions can be evaluated. In some agent programming approaches the executability 
is guaranteed by imposing restrictions on the evaluation mechanism [13], estab- 
lishing consistency rules [17] or providing mechanisms of detection of abnormal 
situations and fallback [5]. The verification that the program will lead the agent
to its goal is also possible given an accurate action model that supports the
assumption that the goal condition is reachable and that the actions performed
will eventually make that condition true.



Action Theories 

The model for actions below is a variation of a model reported in [3], which in
turn is an extension of [8]. It comprises a language for describing actions in the
form of a logical theory, and a semantics that maps such theories into sets of
state trajectories $s_0, s_1, s_2, \ldots$ where each $s_i$ represents the state of the
world at time i. The modeling language is built up from constant, function and
predicate symbols distinguished by certain semantic properties:

1. fixed symbols have denotations that are fixed and known across time

2. fluent symbols have denotations that tend to persist

3. action symbols are used to denote actions

Fixed symbols include symbols like ‘3’, ‘+’, etc., whose denotation is fixed and
standard, and other symbols that we call identifiers that we regard as
self-denoting.

For simplicity we assume that action symbols are propositional symbols denoting
actions, and fluents are either constant or function symbols of arity one. Thus a
relation like on(x, y) in the blocks world is modeled by an equality like
loc(x) = y. Moreover, each functional fluent f will have a type of the form
$D_f \rightarrow R_f$, meaning that the function denoted by f takes elements from
$D_f$ and maps them into $R_f$. The fluent loc, for example, may have a type
$BLOCKS \rightarrow R$, where BLOCKS is a set of block identifiers (block_A, block_B,
etc.) and R is the real line. The domain $D_f$ of the functions denoted by functional
fluents is assumed to be given by a set of identifiers. Terms, atoms, and formulas
are defined from the constant, function, and predicate symbols in the standard
way, except that the only terms of the form f(t) when f is a fluent are the ones
in which t belongs to $D_f$. Such terms, as well as the constant fluent symbols,
are called fluent terms.

Action Rules

The action rules are rules of the form

    $A \wedge C \rightarrow L$

where A is an action, C is a formula not involving action symbols, and L is an
assignment of the form F := t, where F is a fluent term and t is a term. For
example, the action rules:

    $hit \wedge h > \Delta \rightarrow h := h - \Delta$
    $hit \wedge h \leq \Delta \rightarrow h := 0$

say that after hitting a nail with a hammer, its distance to the wall decreases
by a fixed constant $\Delta$ if its original distance was greater than $\Delta$, and to
zero otherwise.



Definitions 

New terms and atoms can be introduced in the language by means of definitions.
For example, the atom close(block_A) can be defined to be true just when
|pos(block_A)| < 0.1 is true, and similarly the term dist(block_A) can be defined
as the value of the term pos(block_A) - pos(me). New terms t are defined by
expressions of the form:

    $t \stackrel{def}{=} t'$

meaning that the denotation (value) of t is the denotation of the term t'.
Similarly, new atoms p can be defined by expressions of the form:

    $p \stackrel{def}{=} A$
    p if A

Defined terms and atoms can be used anywhere as long as they do not introduce
circularities in the definitions.^

An action theory T is a triplet of the form (D, A, O), where D is a domain
theory containing action rules and definitions, A is a set of timed actions, and O is
a set of observations. A timed action is an expression of the form p[i], where p is
an action symbol and i is a time point (a non-negative integer). An observation
is an expression of the form F[i], where F is a non-action formula and i is
a time point. For example, the rules above about hitting a nail together with
the actions A = {hit[0], hit[1], hit[2]} and the observation h[0] = 5 constitute
an action theory. The semantics of such theories is given below. Provided that
the value of the fixed symbol $\Delta$ is 2, this action theory for instance yields the
conclusion h[3] = 0.
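
The nail example can be checked with a few lines of Python. This is only a sketch of the single fluent h under the two hit rules, not an implementation of the general semantics; the function names `step` and `run_actions` are hypothetical, and a time point at which no hit occurs is assumed to leave h unchanged (persistence).

```python
def step(h, hit, delta):
    """Apply the two hit rules to the fluent h for one time step."""
    if not hit:
        return h              # no action: the fluent persists
    return h - delta if h > delta else 0


def run_actions(h0, hit_times, delta, horizon):
    """Values h[0], ..., h[horizon] given the time points at which hit occurs."""
    trajectory = [h0]
    for i in range(horizon):
        trajectory.append(step(trajectory[-1], i in hit_times, delta))
    return trajectory


# Observation h[0] = 5, actions hit[0], hit[1], hit[2], and delta = 2.
values = run_actions(5, {0, 1, 2}, delta=2, horizon=3)
```

The computed values are 5, 3, 1, 0, in agreement with the conclusion h[3] = 0.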

^ Circularities in the definitions mean circular chains of dependencies, where a
defined expression depends on a second defined expression when the second appears
in the definition of the first.

Semantics

The semantics maps an action theory into a set of state trajectories $s_0, s_1, \ldots$
where $s_i$ stands for the state of the world at time i.



States 

Each state s is an interpretation over a domain $\mathcal{D}$ that assigns a denotation
$x^s$ to each expression x in the language in the standard way:

- $c^s \in \mathcal{D}$, if c is a constant symbol

- $f^s \in \mathcal{D}^n \rightarrow \mathcal{D}$, if f is a function symbol with arity n

- $[f(t)]^s = f^s(t^s)$

For standard symbols x (numerals, arithmetic operators and predicates, etc.), $x^s$
is the standard denotation of x, while for identifiers $x^s = x$.

For terms t defined as $t \stackrel{def}{=} t'$, $t^s$ is defined as $t'^s$, while for
atoms p defined as $p \stackrel{def}{=} A$, $p^s$ is defined as $A^s$.

Similarly, for atoms p defined by a collection of clauses of the form p if A,
$p^s$ is true iff $A^s$ is true for some such clause. A state s satisfies a formula F,
written $s \models F$, if $F^s = true$.



Trajectories 

A trajectory $\tau$ is an infinite sequence of states $s_0, s_1, \ldots, s_i, \ldots$ over a
common domain $\mathcal{D}$.

A trajectory $s_0, s_1, \ldots$ is admissible relative to a given domain theory if the
changes in every transition $s_i$ to $s_{i+1}$, for $i \geq 0$, are supported by the
rules; i.e.,

- $f^{s_{i+1}} = t^{s_i}$ if f := t is supported in $s_i$, else $f^{s_{i+1}} = f^{s_i}$

- $f^{s_{i+1}}(c) = t^{s_i}$ if f(c) := t is supported in $s_i$, else
  $f^{s_{i+1}}(c) = f^{s_i}(c)$, for each $c \in D_f$.

An assignment f(t) := t' (f := t) is supported in a state $s_i$ when for some rule
$F \rightarrow f(t) := t'$ ($F \rightarrow f := t$) its antecedent F is true in $s_i$.



Models 

The models of an action theory T = (D, A, O) are the admissible trajectories
(relative to D) that are compatible with both the actions A and the observations O.
A trajectory $s_0, s_1, \ldots$ is compatible with the observations if for each
expression $F[i] \in O$, $F^{s_i}$ is true, and is compatible with the actions A iff for
each action symbol p, $p^{s_i}$ is true iff $p[i] \in A$ (i.e., actions not in A are
assumed to be false).

For the example above, it is simple to check that a transition from a state $s_i$
to a state $s_{i+1}$ is admissible if:




Analysis of Agent Programs Using Action Models 187 



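- $h^{s_{i+1}} = h^{s_i}$, when $hit^{s_i}$ is false or $h^{s_i} = 0$,

- $h^{s_{i+1}} = 0$, when $hit^{s_i}$ is true and $h^{s_i} \leq \Delta$, or $h^{s_i} = 0$

- $h^{s_{i+1}} = h^{s_i} - \Delta$, when $h^{s_i} > \Delta$ and $hit^{s_i}$ is true

In the resulting models $s_0, s_1, \ldots$ of the theory, $h^{s_0} = 5$, $h^{s_1} = 3$,
$h^{s_2} = 1$, and $h^{s_3} = 0$.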

Control Theories 

Control theories are similar to action theories except that actions are replaced by
control programs. In other words, a control theory has the form C = (D, P, O),
where D and O are as above, and P is a control program. A control program is
a finite sequence of condition-action pairs:

    $c_1 \rightarrow a_1 ; c_2 \rightarrow a_2 ; c_3 \rightarrow a_3 ; \ldots$

where each $a_i$ is an action and each $c_i$ is a formula which does not involve any
actions. That sequence is evaluated from scratch at every time point $i \geq 0$, and
the first action whose condition is true is executed. Later on we will consider
two conditions such programs must satisfy: namely, the agent must have the
knowledge to evaluate the conditions $c_i$, and if $c_i$ is the first condition that
evaluates to true, the preconditions of $a_i$ must be true as well. The models of a
control theory T = (D, P, O) are the admissible trajectories (relative to D) that
are compatible with both the observations O and the program P where:

Definition 1. A trajectory $s_0, s_1, \ldots$ is compatible with program P if for each
state $s_i$ and each action a, a is true in $s_i$ if and only if $a = a_j$ and
$c_j \rightarrow a_j$ is the first condition-action pair in P whose condition $c_j$ is
true in $s_i$.

Going back to the example above, it is simple to check that the effect of the
actions hit[0], hit[1] and hit[2] can be achieved by the simple program
$h > 0 \rightarrow hit$. Moreover, the program will achieve the effect h = 0 for any
initial value of h as long as it can evaluate the condition h > 0 (see below).
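
The evaluation of a control program can be illustrated with a small interpreter. This is a sketch under simplifying assumptions, not the formal semantics: states are plain dictionaries, conditions are Python predicates over a state, the run stops when no condition holds or when the special action done is selected, and knowledge is ignored (every condition is assumed to be evaluable). The names `run_program` and `transition` are hypothetical.

```python
def run_program(program, state, transition, max_steps=100):
    """Repeatedly execute the first action whose condition is true.

    program is a list of (condition, action) pairs, scanned from the top
    at every time point.
    """
    trace = []
    for _ in range(max_steps):
        action = next((a for cond, a in program if cond(state)), None)
        if action is None or action == "done":
            break
        trace.append(action)
        state = transition(state, action)
    return state, trace


DELTA = 2


def transition(state, action):
    """Effect of hit on the fluent h, following the two hit rules."""
    h = state["h"]
    if action == "hit":
        h = h - DELTA if h > DELTA else 0
    return {"h": h}


# The program  h > 0 -> hit,  started with h = 5.
program = [(lambda s: s["h"] > 0, "hit")]
final_state, trace = run_program(program, {"h": 5}, transition)
```

Starting from h = 5 the interpreter executes hit three times and then stops with h = 0, the same effect as the timed actions hit[0], hit[1], hit[2].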

Action Constraints 

So far we have ignored that actions often have preconditions which may prevent
the action from being executed. For example, an agent cannot move forward when
facing a close obstacle, he cannot pick up an object if he does not have an empty
hand, etc. We accommodate preconditions by extending action and control theories
with a fourth component: action constraints. Action constraints are expressed
by formulas of the form:

    $a \supset C$

where a is an action, and C is a formula expressing a precondition for a.

The semantics of action constraints is very simple and follows the Strips
model [6]: an action theory is executable when no model of the theory contains
a state $s_i$ that violates an action constraint. For example, if
$move \supset clear\_front$ is a constraint, a theory in which a move is performed
when clear_front is false is non-executable.
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Knowledge 

The executability of control theories is a bit more subtle than the executability
of simple actions, as the agent has to be able to evaluate the conditions in
the program. A condition-action pair may be $facing(goal) \rightarrow move$, yet the
agent may not know whether it is facing the goal or not (say because of limited
visibility, presence of obstacles, etc.). In such a case the program is not executable.

In order to characterize the executability of programs we need to model what
the agent knows. For that purpose we introduce an intensional operator K so
that K(x) for a term or (objective) formula x means that the denotation of x
is known (see [7] for details, and [12] and [15] for related approaches). The
knowledge of the agent will be grounded in the definitions in the theory and
in the expressions x whose denotations are observable (e.g., the time of the day
is observable when looking at the clock).^ We express that the denotation of a
term or formula x is observable to the agent by writing obs(x). The conditions C
that make certain terms or expressions x observable are encoded by means of
defining clauses of the form (see Section 2)

    obs(x) if C

that indicate that obs(x) is true when some such formula C is true. The truth
of the epistemic expression K(x) in a state s is determined by the denotation
of x in all states s' that are possible from s given the definitions and observables
that are true in s (i.e., the expressions x s.t. obs(x) is true in s):^

Definition 2. K(x) is true in s iff $x^{s'} = x^{s}$ for all states s' that are
accessible from s, where s' is accessible from s if for all expressions x observable
in s, $x^{s'} = x^{s}$.

For example, if the atom high is defined in the theory as h > 10, the truth of
the atom high will be known if the fluent h is observable (i.e., K(high) follows
from obs(h)). This is because for all states s' accessible from s we must have
$h^{s'} = h^{s}$ (because h is observable), and hence $(h > 10)^{s'} = (h > 10)^{s}$.

More generally all observable expressions are always known, and if an expression
x (atom or term) is defined in terms of expressions y that are known, x will
be known as well. Provided with this model of knowledge, the executability of a
program relative to a given domain theory can be characterized as follows:

Definition 3. A program is executable if in every state $s_i$ of every model
$s_0, s_1, \ldots$ 1) all action constraints are satisfied, and 2) all conditions
$c_1, c_2, \ldots, c_j$ up to and including the first condition that is true in $s_i$
are known.

In all programs that we consider, the action $a_1$ associated with the first
condition $c_1$ will be the special action done that has no effects. We will be able
to say that a program P achieves a formula (goal) G if in every model the action
done becomes true at some point, and at such point G is true.
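
The accessibility-based definition of knowledge can be illustrated on the high example with a finite set of candidate states. This is a toy sketch, assuming that states are dictionaries, expressions are Python functions of a state, and `observables(s)` returns the expressions that are observable in s; all names are hypothetical.

```python
def accessible(s, s2, observables):
    """s2 is accessible from s if it agrees with s on everything observable in s."""
    return all(x(s2) == x(s) for x in observables(s))


def known(x, s, states, observables):
    """K(x) holds in s iff x has the same denotation in all accessible states."""
    return all(x(s2) == x(s) for s2 in states if accessible(s, s2, observables))


def h(state):
    """The fluent h."""
    return state["h"]


def high(state):
    """The defined atom high, true when h > 10."""
    return state["h"] > 10


states = [{"h": value} for value in range(21)]   # candidate states, h = 0..20
s = {"h": 12}

known_when_observable = known(high, s, states, lambda st: [h])
known_when_hidden = known(high, s, states, lambda st: [])
```

When h is observable, only the state with h = 12 is accessible and high is known; when nothing is observable, every candidate state is accessible and the truth of high is not known.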

^ The model of knowledge below is a simplification of the model in [3] which assumes 
that the agent also knows the rules in the theory. 

^ Notice that all states automatically satisfy the definitions. 
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Analysis of Simple Navigation Programs 

Situation 1 

We are ready to analyze Program 2, an extended version of the program
discussed in Section 1. For that we need to model each of the conditions and
actions involved in the program.

Program 2 Amble and Move to Goal

    $reached(goal) \rightarrow done$
    $see(goal) \wedge facing(goal) \rightarrow move$
    $see(goal) \rightarrow rotate$
    $\neg blocked \rightarrow move$
    $true \rightarrow rotate$



First we assume a set of object identifiers OBJ, containing the identifier goal
for the goal object, and a coordinate system with origin in the bottom-left corner
of the rectangular environment shown in Figure 1. The position and orientation
of the agent will be represented by three fluents xpos, ypos and angle. Following
the notation suggested by Latombe in [9] and assuming that the agent is a rigid
free-flying object, xpos and ypos are the coordinates of the center of the agent
with respect to the origin of the environment, and angle is the angle between
the x-axis of the agent and that of the environment, restricted to the interval
$(0, 2\pi]$ with modulo $2\pi$ arithmetic. The effects of the two actions on these
fluents are captured by the rules:

    $rotate \rightarrow angle := angle + \delta_{angle}$
    $move \rightarrow xpos := xpos + \delta_{dist} \cos(angle)$
    $move \rightarrow ypos := ypos + \delta_{dist} \sin(angle)$

where $\delta_{angle}$ and $\delta_{dist}$ are two known constants standing for the
angular and linear step sizes. The agent has no information about the absolute
location of the objects (or itself), yet it can determine the relative positions
dist(obj) and angles angle(obj) of the objects obj he can ‘see’ (this is the so-called
indexical information [10]). Given that the absolute position of the objects is
captured by the fluents xloc and yloc, their position and angles relative to the agent
can be defined as follows:



    $dist(obj) \stackrel{def}{=} \sqrt{(xpos - xloc(obj))^2 + (ypos - yloc(obj))^2}$

    $angle(obj) \stackrel{def}{=} angle + \theta(obj)$

    $\theta(obj) \stackrel{def}{=} \sin^{-1} \frac{ypos - yloc(obj)}{dist(obj)}$

where obj stands for each of the identifiers in the set OBJ.

With these definitions, some of the conditions in the program can be modeled 
by the schemas: 



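$\mathit{facing}(obj) \stackrel{\mathrm{def}}{=} \mathit{angle}(obj) < \delta\mathit{angle}$
$\mathit{reached}(obj) \stackrel{\mathrm{def}}{=} \mathit{facing}(obj) \wedge \mathit{close}(obj)$
$\mathit{close}(obj) \stackrel{\mathrm{def}}{=} \mathit{dist}(obj) < \delta\mathit{dist}$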

The condition see(goal) in the program is special because it aims to model the
perceptual machinery of the agent. If we want to predict how the agent is going
to behave, we have to provide a model for the condition see.^ A very simple
model assumes that the agent sees any object that is within a certain
maximal distance (vdist) and a certain maximal angle (vfield):

$\mathit{see}(obj) \stackrel{\mathrm{def}}{=} \mathit{dist}(obj) < \mathit{vdist} \wedge \mathit{angle}(obj) < \mathit{vfield}$

We assume that the effect of ‘seeing’ an object is to make its relative distance
and angle observable:

obs(dist(obj)) if see(obj)
obs(angle(obj)) if see(obj)

In addition, the agent always knows whether it is seeing an object or not:

obs(see(obj)) if true

Finally we assume that the action move has the precondition

move ⊃ ¬blocked

where blocked is defined by the collection of clauses:

blocked if facing(obj) ∧ close(obj)

obtained by replacing obj by each identifier in OBJ, including the walls, which are
modeled as obstacles.

We are ready to prove that, under some conditions, Program 2 will
be executable and will achieve the goal reached(goal). The conditions that we
assume are: OBJ = {goal} (no other object but the goal; in particular no walls or
obstacles), vfield = 360° (full visibility in all directions), and initially
dist(goal) < vdist (the goal object is initially within the linear visibility range).
Two other natural conditions that we assume are δangle < vfield/2 and δdist < vdist.

^ Note that see is modeled as a ‘defined condition’ rather than as a ‘knowledge
gathering action’ as in [15]. In this way, the agent continuously gathers information
from its surroundings (when certain conditions hold) without requiring its active
participation in the form of deliberate actions.

We prove first that the action constraint move ⊃ ¬blocked is satisfied in all
states s_i of every model s_0, s_1, ... of the resulting control theory. Indeed, since
there is only one object, goal, blocked is true iff facing(goal) and close(goal)
are true. Yet that means that reached(goal) is true, and hence that move is
false. Thus, move ⊃ ¬blocked is always satisfied. We prove now that the agent
always has the knowledge to evaluate the conditions in the program. Let us
prove first that the truth of reached(goal) is always known. We consider two
cases: when see(goal) is true and when see(goal) is false. In the second case,
reached(goal) must be false because the restrictions δangle < vfield/2 and
δdist < vdist guarantee that ¬see(goal) implies ¬close(goal) ∨ ¬facing(goal).
Likewise, in the first case, dist(goal) and angle(goal) must be known, and hence
the atoms close(goal) and facing(goal), as well as reached(goal), must be known
too. As a result, we get that if see(goal) is false, reached(goal) must be false,
and if see(goal) is true, reached(goal) must be known. Since the state of the
atom see(goal) is always known, this means that the state of reached(goal)
will be known too. Similar arguments suffice to prove that the other conditions
(facing(goal) and ¬blocked) will also be known.

We are left to show that, under the assumptions above, the program will
lead the agent to a state where reached(goal) is true. Actually, under those
assumptions see(goal) must be initially true. Furthermore, since there is full
angular visibility, and the agent only moves in the direction of the goal, once
see(goal) becomes true, it remains true throughout. That means, among other
things, that the agent cannot be rotating forever as, eventually, the condition
facing(goal) will become true. Yet once this condition becomes true, the agent
will move towards the goal, and every time the agent moves, dist(goal) will
decrease.^ This does not mean, however, that once facing(goal) becomes true
it remains true until the goal is reached. Yet, even if facing(goal) becomes false,
the distance to the goal decreases and the program will bring the agent back to
a state where facing(goal) is true again. Thus, eventually, in a finite number of
steps, reached(goal) will be true.
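
The argument above can be checked numerically with a small simulation. The sketch below is only an illustration under stated assumptions, not the authors' simulator: the function names (`wrap`, `run_program2`) are hypothetical, the relative bearing of the goal is computed with `atan2` and tested symmetrically (|bearing| < δangle) instead of using the paper's angle(obj) definition, and vfield is taken to be 360° with OBJ = {goal}.

```python
import math


def wrap(a):
    """Wrap an angle to the interval (-pi, pi]."""
    a = math.fmod(a, 2 * math.pi)
    if a > math.pi:
        a -= 2 * math.pi
    elif a <= -math.pi:
        a += 2 * math.pi
    return a


def run_program2(x, y, heading, goal, d_angle, d_dist, vdist, max_steps=10_000):
    """Run Program 2 with full angular visibility and the goal as only object.

    Returns (reached, steps_used, final_distance).
    """
    gx, gy = goal
    for step in range(max_steps):
        dist = math.hypot(gx - x, gy - y)
        # Assumed sensing model: bearing of the goal relative to the heading.
        bearing = wrap(math.atan2(gy - y, gx - x) - heading)
        see = dist < vdist                  # vfield = 360 degrees
        facing = abs(bearing) < d_angle
        close = dist < d_dist
        blocked = facing and close          # the goal is the only object
        if facing and close:                # reached(goal) -> done
            return True, step, dist
        if see and facing:                  # see ^ facing -> move
            x += d_dist * math.cos(heading)
            y += d_dist * math.sin(heading)
        elif see:                           # see -> rotate
            heading = wrap(heading + d_angle)
        elif not blocked:                   # not blocked -> move
            x += d_dist * math.cos(heading)
            y += d_dist * math.sin(heading)
        else:                               # true -> rotate
            heading = wrap(heading + d_angle)
    return False, max_steps, math.hypot(gx - x, gy - y)


# Hypothetical start pose and parameters: 10-degree turns, steps of 0.25.
reached, steps, final_dist = run_program2(
    0.0, 0.0, math.pi / 2, (3.0, 4.0), math.radians(10), 0.25, 10.0
)
print(reached, steps, round(final_dist, 3))
```

With δangle well below 60° each move strictly reduces the distance to the goal, which is why the loop terminates long before the step cap in this setting.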



Situation 2 

If the assumption that the visibility range is 360° is changed and a limited range
is used instead, Program 2 remains executable but does not necessarily
lead the agent to the goal. This is because, in the new context, it is no longer
true that once see(goal) is true, it remains true throughout. Indeed, if the agent
is not facing the goal, it will rotate, and at a certain point see(goal) may become
false. At that point, the agent will move away from the goal and will not come
back (since we are assuming that there are no other objects such as walls).

There is however a simple modification to the program that avoids this prob- 
lem. It consists in the introduction of two different rotations, a left rotation and 
a right rotation. Under the conditions above, even with a limited visibility range, 

^ This involves some trigonometric arguments based on the fact that the maximum
δangle must be 60°.

Program 3 Specialization of the action rotate

reached(goal) → done
see(goal) ∧ facing(goal) → move
see(goal) ∧ on-left(goal) → rotate-l
see(goal) ∧ on-right(goal) → rotate-r
¬blocked → move
true → rotate-l



the new program leads the agent to the goal. The action rules for the two new 
actions are: 

rotate-r → angle := angle + δangle

rotate-l → angle := angle − δangle

and the definitions of the new conditions are: 



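$\mathit{on\text{-}left}(obj) \stackrel{\mathrm{def}}{=} 0 < \mathit{angle}(obj) < \mathit{vfield}/2$
$\mathit{on\text{-}right}(obj) \stackrel{\mathrm{def}}{=} (360 - \mathit{vfield}/2) < \mathit{angle}(obj) < 360$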



Situation 3 

In the presence of other objects, i.e., when OBJ includes other objects besides
goal, the situation changes significantly. First of all, Program 1 is not
executable, as in certain situations a move can be triggered while
the precondition ¬blocked is false. This, however, is easy to fix: we just
need to introduce the condition ¬blocked among the conditions for move.

The other problem (or feature!) is that the way the model defines
see(obj) implicitly assumes that all objects are transparent; namely, the
visibility of the agent (i.e., the definition of see(goal)) depends only on the linear
and angular distance to the goal, without taking into account the presence of
objects that may be in the way.

This all implies that if see(goal) is always true (say, because the parameters
vfield and vdist are sufficiently large), the agent will move only when
facing(goal) and ¬blocked are both true, and will rotate otherwise. However,
when the obstacle in the way is perpendicular to the line that joins the agent
to the goal, rotations alone cannot establish the conjunction facing(goal) ∧
¬blocked (because in such an arrangement facing(goal) will be true exactly when
facing(obst) is true, where obst is the obstacle identifier). Thus in this case, the
program will make the robot rotate forever. A solution to this problem can
be obtained by replacing the third line of Program 2 with the line:

see(goal) ∧ ¬will-block → rotate

where will-block, defined as:^

will-block if close(obj) ∧ angle(obj) < 2δangle

detects rotations that will make the condition blocked true, and in those cases
makes the agent take a step away from the obstacle.

Using the resulting program, the agent exhibits the typical behavior of a fly
trying to get past a window.^ The correctness proof of the resulting program is
slightly more complex and depends on some topological constraints on the set of
objects (the presence of separations among the objects, etc.).

The modeling of ‘opaque’ objects as opposed to transparent objects requires
a redefinition of see(obj) to:

$\mathit{see}(obj) \stackrel{\mathrm{def}}{=} \mathit{dist}(obj) < \mathit{vdist} \wedge \mathit{angle}(obj) < \mathit{vfield} \wedge \neg\mathit{occluded}(obj)$

where the occluded predicate must be defined in terms of the ‘cells’ occupied by
the objects in OBJ and the location (xpos, ypos) of the agent. In the presence
of such objects, the agent can be trapped in trajectories in which the goal object
is never visible, and thus in which the program will not lead the robot to the
goal. This can be seen using the simulator built to experiment with the theories,
models and programs posed in this work.



Conclusions and Related Work 

We have used a theory of actions and knowledge to analyze control programs for
navigation tasks, modeling both physical and sensing actions and establishing
the conditions under which different programs are executable and lead the agent
to the goal. Action models can also be useful for the construction of
control programs. Nilsson [14], for example, advocates a methodology for writing
teleological control programs in which the actions in one line are supposed to
contribute to the realization of the conditions in the preceding lines. This design
criterion can be formalized in this framework as:

Definition 4. A program P is teleoreactive if in all models s_0, s_1, s_2, ...,
every state s_i that makes c_j → a_j, for j > 0, the first applicable condition-action
pair in s_i is followed after a finite number of time points by a state s_{i+Δ} that
makes c_k → a_k the first applicable condition-action pair, where k < j.

Our analysis shows clearly that even in simple programs such as Program 1, which
satisfies the Universal property of TeleoReactive programs [13], there are
conditions that violate the TeleoReactive principle, since once the robot starts
moving, facing(goal) may become temporarily false. However, these execution errors
will not prevent (in this case) the goal condition from ultimately being achieved.

^ For the condition will-block to be known, 2δangle should be smaller than vfield.

^ This can be observed using the simulator available at
http://www.ldc.usb.ve/~92-24791/TR/; snapshots were not included due to
space limitations.

For this type of model to be useful for the analysis of real robot plans, however,
both richer control structures (e.g., see [11,7,2]) and uncertainty
(e.g., [1,16,7]) would need to be accommodated. We plan to explore some of
these issues, as well as the automatic construction of robot plans, elsewhere.



Abstract. This paper presents an extension of Bayesian networks (BN) applied
to reliability analysis. We developed a general methodology for reliability
modeling of complex systems based on Bayesian networks. A reliability
structure represented as a reliability block diagram is transformed to a Bayesian
network representation, and with this, the reliability of the system can be
obtained using probability propagation techniques. This allows for modeling
complex systems, such as a bridge type, and dependencies between failures,
which are difficult to obtain with conventional reliability analysis techniques.
The relation between a BN and a fault tree, and some advantages of BN for
modeling system reliability, are shown. We present some examples of the
application of this methodology in solving difficult cases which occur in the
reliability analysis of power plants.



1. Introduction 

Complex industrial plants and equipment for critical applications, such as power
plants, require a high reliability, i.e., a very low probability of failure. For this, there
are statistical techniques that can predict the reliability of a complex system based on
its structure and the reliability of each component. Some traditional techniques for
reliability analysis have several important limitations, including the assumption that
all the failures are independent and that the rate of failure is constant (exponential
model). Also, building the model used to calculate the reliability of the system is a
difficult and complex task, so an expert reliability engineer is usually required.

In general, failure prediction is a difficult problem. However, for a given time
period (mission time), the probability of failure can be obtained by applying
probability theory. In the context of this work, reliability is the probability that the
equipment performs its intended functions satisfactorily or without failure, for a
mission time, under specific design and environmental conditions. The reliability of
complex equipment depends on the individual reliability of its elements.

The motivation for developing this work is to obtain a computational method that
can explicitly incorporate dependencies between failures and include the effects of
maintenance in the reliability analysis of complex systems in operation. A Bayesian
network is used to represent the system reliability structure and to obtain its
reliability via probability propagation. With this representation the limitations of
other techniques are avoided, so it is possible to manage dependencies and
non-exponential distributions.

This paper is divided into seven parts. The second part summarizes the theory of
Bayesian networks, and the third one, general aspects of reliability analysis. The
fourth part focuses on dependency between failures in reliability analysis and the fifth
part presents a procedure for systems reliability modeling supported by Bayesian
networks. The following part presents an application to reliability analysis of power
plants. Finally, the conclusions and future work are presented.



2. Bayesian Network 

Bayesian networks are directed acyclic graphs (DAG), see figure 1, in which the
nodes represent propositions (or variables), the arcs signify direct dependencies
between the linked propositions, and the strength of these dependencies is quantified
by conditional probabilities [8]. Such graphical structures, known also as belief
networks, are used for representing expert knowledge. The graph represents a set of
random variables and their dependency and independency relations. It is used to
estimate the posterior probability of unknown variables given other variables
(evidence), through a process known as probabilistic reasoning. This generates
recommendations or conclusions about a particular problem, and can be used for
explanation, the process of communicating the relevant information to the user.




Fig. 1. Example of a directed acyclic graph

2.1 Probability Propagation

The topology of a Bayesian network represents the dependency relations between the
variables involved. It shows which variables are conditionally independent
given another variable. In figure 1, C is conditionally independent of D given
A, that is:

$P(C \mid D, A) = P(C \mid A)$

An advantage of Bayesian networks is that they provide a compact representation 
of the joint probability distribution of the variables. This probability can be expressed 
as a product of the conditional distributions of each node given its direct influences 
(parents) in the graph. Hence, letting pa(t) denote the parents of node t, the graph 
implies that the joint distribution P(T) has the form: 



$P(T) = \prod_{t \in T} P(t \mid pa(t))$   (1)

This is also known as a recursive model with respect to some DAG. Thus, for 
example, the model for figure 1 is equivalent to: 

$P(A,B,C,D,E,F) = P(F \mid E)\, P(E \mid C,D)\, P(D \mid A)\, P(C \mid A)\, P(B \mid A)\, P(A)$   (2)
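
As a hedged illustration of the factorization, the sketch below builds the joint distribution of a four-node sub-network (A → C, A → D, and C, D → E) as a product of conditionals and checks numerically that C and D are conditionally independent given their common parent A. All probability values and helper names (`bern`, `joint`, `prob`) are made up for the example; they do not come from the paper.

```python
from itertools import product

# Hypothetical CPTs for binary variables (True/False).
p_a = {True: 0.3, False: 0.7}                      # P(A = a)
p_c_given_a = {True: 0.8, False: 0.1}              # P(C = True | A = a)
p_d_given_a = {True: 0.6, False: 0.2}              # P(D = True | A = a)
p_e_given_cd = {(True, True): 0.95, (True, False): 0.70,
                (False, True): 0.50, (False, False): 0.05}  # P(E = True | C, D)


def bern(p_true, value):
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(a, c, d, e):
    """Product of each node's conditional given its parents."""
    return (p_a[a]
            * bern(p_c_given_a[a], c)
            * bern(p_d_given_a[a], d)
            * bern(p_e_given_cd[(c, d)], e))


def prob(**fixed):
    """Probability of a partial assignment, obtained by summing the joint."""
    total = 0.0
    for a, c, d, e in product([True, False], repeat=4):
        world = {"a": a, "c": c, "d": d, "e": e}
        if all(world[name] == value for name, value in fixed.items()):
            total += joint(a, c, d, e)
    return total


total_mass = prob()
p_c_given_d_and_a = prob(c=True, d=True, a=True) / prob(d=True, a=True)
p_c_given_a_only = prob(c=True, a=True) / prob(a=True)
print(total_mass, p_c_given_d_and_a, p_c_given_a_only)
```

Enumeration is exponential in the number of variables, so this is only suitable for checking small examples; the propagation schemes described next avoid building the full joint.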

The reasoning mechanism is called probabilistic reasoning. It consists in
instantiating the input variables (symptoms or evidence) and propagating their effect
through the network to update the probability of the hypothesis variables. The
propagation procedure is based on Bayes' theorem and the structure of dependencies
of the network.

Propagation in trees. A tree structured network has only one node, called root node, 
without parents and the rest of the nodes have only one parent. 

In a tree, any node $C_i$ can be a point of division into two independent sub-trees.
One sub-tree has the node of division as its root and is denoted by (−); the data
contained in this sub-tree represent the evidence $V^-$. The remainder of the tree is
denoted by (+), with evidence $V^+$ [7]. Therefore, the posterior probability of any
variable $C_i$ can be obtained by Bayes' theorem as:

$P(C_i \mid V) = P(C_i)\, P(V^-, V^+ \mid C_i) / P(V)$   (3)

But since both sub-trees are independent, and with Bayes' theorem further applied, we
have:

$P(C_i \mid V) = \alpha\, P(C_i \mid V^+)\, P(V^- \mid C_i)$   (4)

where $\alpha$ is a normalization constant. If we define:

$\pi(C_i) = P(C_i \mid V^+)$   (5)

$\lambda(C_i) = P(V^- \mid C_i)$   (6)

Replacing equations (5) and (6) in (4), we obtain:

$P(C_i \mid V) = \alpha\, \pi(C_i)\, \lambda(C_i)$   (7)

The above equation offers a way to update the probabilities of any node $C_i$ as a
product of the predictive evidential support ($\pi$) from all non-descendant nodes of
$C_i$, mediated by its parent, and the retrospective evidential support ($\lambda$)
from its descendants.
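
The product form of equation (7) can be illustrated on the smallest possible tree, a chain U → C → Y in which U is observed above C and Y is observed below it. This is a minimal sketch with invented numbers and hypothetical function names; it compares the π·λ update against brute-force conditioning on the joint distribution.

```python
# Chain U -> C -> Y with binary states 0/1; all numbers are hypothetical.
p_u = [0.6, 0.4]                            # P(U = u)
p_c_given_u = [[0.9, 0.1], [0.3, 0.7]]      # p_c_given_u[u][c] = P(C = c | U = u)
p_y_given_c = [[0.8, 0.2], [0.25, 0.75]]    # p_y_given_c[c][y] = P(Y = y | C = c)


def posterior_pi_lambda(u_obs, y_obs):
    """Posterior of C from pi (evidence above) times lambda (evidence below)."""
    pi = [p_c_given_u[u_obs][c] for c in (0, 1)]     # pi(C) = P(C | V+)
    lam = [p_y_given_c[c][y_obs] for c in (0, 1)]    # lambda(C) = P(V- | C)
    unnormalized = [pi[c] * lam[c] for c in (0, 1)]
    alpha = 1.0 / sum(unnormalized)                  # normalization constant
    return [alpha * value for value in unnormalized]


def posterior_brute_force(u_obs, y_obs):
    """Same posterior, computed by conditioning the full joint distribution."""
    joint = [p_u[u_obs] * p_c_given_u[u_obs][c] * p_y_given_c[c][y_obs]
             for c in (0, 1)]
    evidence = sum(joint)
    return [value / evidence for value in joint]


fast = posterior_pi_lambda(1, 1)
slow = posterior_brute_force(1, 1)
print(fast, slow)
```

In a larger tree π and λ would themselves be assembled from messages received from the parent and the children, which is what makes the update local.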

The propagation procedure can be implemented through communications between 
neighboring nodes, by local operations, and by sending messages between connected 
nodes in the network [9]. 

Propagation in polytrees or singly connected networks. A polytree is a network
in which a node can have more than one parent, but without multiple paths between
nodes (figure 2).




Fig. 2. Example of a polytree 

The propagation of probabilities in polytree structures is very similar to the case of
tree networks [7]. The principal difference is that polytrees require the conditional
probability of each node given all its parent nodes. As for tree-structured
networks, an expression to obtain the probability of any node given some evidence
can be deduced for polytrees [9].

Consider a typical fragment of a singly connected network, consisting of a node $B_i$,
the set of all parents of $B_i$, $V^+ = \{V_1^+, \ldots, V_n^+\}$, and the set of all
children of $B_i$, $V^- = \{V_1^-, \ldots, V_m^-\}$.

As before, let V be the total evidence obtained, so that:

$P(B_i \mid V) = \alpha\, P(B_i \mid V_1^+, \ldots, V_n^+)\, P(V_1^- \mid B_i) \cdots P(V_m^- \mid B_i)$   (8)

Dividing the polytree into two parts, $V^+$ and $V^-$, it is possible to obtain a
mechanism for local probability propagation similar to the one for trees.

The algorithm for probability propagation in polytrees is very efficient: the
computation time required is nearly proportional to the diameter (longest path) of the
network. For multiconnected networks probability propagation is more complex and
there are several algorithms based on clustering, conditioning and stochastic
simulation [9, 7].



3. Reliability Analysis 

In reliability analysis, we can distinguish three characteristic types of failures which
may be inherent in the behavior of the equipment [1]. First, there are the failures
which occur early in the life of a component. These are called early failures and in the
majority of cases are the result of poor manufacturing and quality control
techniques during the production process. Second, there are failures which are caused
by wear-out of parts. These occur in equipment only if it is not properly maintained or
not maintained at all. Third, there are the so-called “chance” failures. These failures
are caused by sudden cumulative stress beyond the design strength of the component.
Chance failures occur at random intervals, irregularly and unexpectedly.

Reliability analysis differentiates between early, wear out, and chance failures for 
two main reasons. First, each one of these types of failures follows a specific 
statistical distribution and therefore requires a different mathematical treatment. 
Second, different methods must be used for their elimination or correction. 

In the reliability analysis of a complex system, it is nearly impossible to model the
complete system as a whole. The logical way to proceed is to divide the system into
smaller elements, units, subsystems, or components. The main assumption is that
every entity has two states, success and failure (although sometimes three or more
states are needed). The subdivision generates a “block diagram” that is similar to the
description of systems in operation [3]. The models are then fitted to this structure,
and they use probabilistic techniques to calculate the reliability of the system in
terms of the reliability of the subdivisions [10].

To evaluate adequate performance, an observation of inadequate performance
in operation is required; therefore, the frequency at which malfunctions and failures
occur is used as a parameter for a mathematical formulation of reliability. This
parameter is called the failure rate; it is usually measured in number of failures per
unit operating hour. Its reciprocal value is called the mean time between failures and
is measured in hours [1].
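
As a small worked example of these two quantities, the sketch below converts a failure rate into a mean time between failures and, assuming the constant-rate (exponential) model that the introduction attributes to traditional techniques, into a reliability for a given mission time. The function names and the numeric values are hypothetical.

```python
import math


def mean_time_between_failures(failure_rate_per_hour):
    """MTBF in hours: the reciprocal of the failure rate."""
    return 1.0 / failure_rate_per_hour


def reliability_exponential(failure_rate_per_hour, mission_hours):
    """R(t) = exp(-rate * t); valid only when the failure rate is constant."""
    return math.exp(-failure_rate_per_hour * mission_hours)


rate = 1e-4                     # hypothetical: one failure per 10,000 operating hours
mtbf = mean_time_between_failures(rate)
r_1000h = reliability_exponential(rate, 1000.0)
print(mtbf, round(r_1000h, 4))
```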



4. Dependency between Failures in Reliability Analysis 

Our objective is to build a versatile computational tool capable of evaluating the
reliability of complex systems during their useful life or wear-out period.
Traditionally, fault trees [3] are used for reliability analysis. However, this
technique has its limitations. It usually assumes independent events, and it is
difficult to model dependencies between events or faults.

Dependent events can be found in reliability analysis in the following cases: 

1) Common causes. A condition or event which provokes multiple elemental 
failures is called a common cause. For instance, fire or flood may cause 
simultaneous failures of sets of components. Thus, under these conditions, 
component failures are no longer independent. Other sources of common cause are 
aging, human error and system environment. 

2) Mutually exclusive primary events. Consider the basic events: “switch fails to 
close” and “switch fails to open”. These two basic events are mutually exclusive, 
i.e., the occurrence of one basic event precludes the other. Thus, we encounter 
dependent basic events when a fault tree involves mutually exclusive basic events. 

3) Standby redundancies. When an operating component fails, a standby
component is put into operation, and the redundant configuration continues to
function. Thus, component failures are not statistically independent, since the
failure of an operating component causes a standby component to be more
susceptible to failure.

4) Components supporting loads. Assume that a set of components supports loads
such as stress, current, etc. A failure of one component increases the load
supported by the other components. Consequently, the remaining components are
more likely to fail, and we cannot assume statistical independence of these
components.

Bayesian networks make it possible to represent explicitly the dependencies between
failures mentioned above. We suggest employing this approach for the reliability
analysis of complex systems, in particular when there are dependent failures.



5. Procedure for System Reliability Modeling

The procedure for reliability analysis based on Bayesian networks consists in defining
the conditional probability matrices equivalent to the series and parallel
configurations of simple systems, like the AND/OR gates used in fault trees. Since the
reliability block diagram is a commonly used methodology for reliability analysis, we
will refer to it to introduce the representation with BN. Reliability analysis begins
with the construction of a reliability block diagram of the system. This is a graphic
representation where every component is represented as a block or rectangle
connected to other components, in series or in parallel.



Considering a series or parallel system with only two components, figure 3, its
representation as a Bayesian network is shown in figure 4, with one additional node,
X. We use circles for representing series systems and squares for parallel systems. The
X node is a binary variable that represents the system state, success or fault.




Fig. 3. System reliability block diagram: (a) series, (b) parallel




Fig. 4. Bayesian network for two components: (a) series system, (b) parallel system 



According to equation (1), the joint probability of the series system is:

P(X,A,B) = P(X/A,B) P(A,B)   (9)

where the columns of the conditional probability matrix P(X/A,B) correspond to
the combinations of the parent node states: A and B are operating, A is operating and
B has failed, A has failed and B is operating, and A and B have failed. The first row
represents the success probability of the system given the states of A and B. This
matrix is equivalent to an AND gate used in fault trees:

$P(X/A,B) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}$   (10)



The elements of the P(A,B) matrix are taken from the marginal probabilities P(A) and
P(B); for example, P(a'b) is the probability that component A is in a failure state and
component B in an operating state.

$P(A,B) = \begin{bmatrix} ab \\ ab' \\ a'b \\ a'b' \end{bmatrix}$   (11)



In the parallel case only the conditional probability matrix is modified; this matrix
is equivalent to an OR gate [11]:

$P(X/A,B) = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$   (12)

Following the above scheme, the generalization to multiple components is not
difficult. For instance, for a three-component system which requires at least two
components functioning, the system representation using a BN is shown in figure 5,
and the probability matrix will be:

$P(X/A,B,C) = \begin{bmatrix} 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}$   (13)



Thus, for simple systems (series/parallel combination without dependent failures) we 
obtain a BN that has an inverted tree structure (polytree). This is equivalent to a fault 
tree and will give the same results. However, the BN model could be extended to 
represent more complex systems, that are difficult, if not impossible, to model with 
fault trees. 
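
The gate matrices above can be generated mechanically. The following sketch, under the assumption of independent components and with hypothetical function names and reliabilities, builds the success row of the conditional probability matrix for a "at least k of n" gate (k = n gives the series/AND case, k = 1 the parallel/OR case) and combines it with the parent state probabilities as in equation (9). Columns follow the order produced by `itertools.product`, which for three components need not match the column order used in the text.

```python
from itertools import product


def k_out_of_n_cpt(n, k):
    """Parent states and the success row of P(X | parents).

    A state is a tuple of booleans (True = operating); the entry is 1 when
    at least k of the n parents operate.
    """
    states = list(product([True, False], repeat=n))
    success_row = [1.0 if sum(state) >= k else 0.0 for state in states]
    return states, success_row


def system_reliability(component_reliabilities, k):
    """P(X = success) = sum over parent states of P(X | state) * P(state)."""
    n = len(component_reliabilities)
    states, success_row = k_out_of_n_cpt(n, k)
    total = 0.0
    for state, p_success in zip(states, success_row):
        p_state = 1.0
        for operating, r in zip(state, component_reliabilities):
            p_state *= r if operating else 1.0 - r   # independence assumed
        total += p_success * p_state
    return total


series = system_reliability([0.9, 0.8], k=2)          # both must operate
parallel = system_reliability([0.9, 0.8], k=1)        # one is enough
two_of_three = system_reliability([0.9, 0.9, 0.9], k=2)
print(series, parallel, two_of_three)
```

For two components the generated success rows are [1, 0, 0, 0] and [1, 1, 1, 0], i.e. the first rows of the AND and OR matrices.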




Fig. 5. Bayesian network for a series 3 component system

6. Applications to Reliability Analysis

6.1 Complex Combination of Series-Parallel System

In order to exemplify the advantages of using BN, consider the schematic reliability
block diagram in figure 6. This system is known as bridge type. The system is
operable if at least one of the paths AC, BD, AED or BEC is good.




Fig. 6. Reliability block diagram 

The usual method to compute the reliability of a bridge system is to select a
component and consider two alternatives: the component is working (good) or the
component has failed (bad) [1]. In this case the E element is chosen, which is the best
choice to simplify the solution. The system is divided into two subsystems, one where E
is considered good and the other where E has failed.

When a set of subsystems is defined using series-parallel connected
configurations, the total reliability can be evaluated by applying the total
probability theorem. The probability of success of the complete system in terms of
conditional probabilities is
P(X) = P(X/E=good) P(E=good) + P(X/E=bad) P(E=bad).

The previous method is laborious. However, using a BN approach the solution is
simplified, so the system reliability can be obtained from a single network. A graphic
representation of the bridge system in the scheme of BN is shown in figure 7.




Fig. 7. Bayesian Network of a complex system

For example, if the components A, B, C, D and E each have a success probability of
0.9 (failure probability 0.1), the system reliability is 0.9785 and its failure
probability 0.0215. To obtain these values, we apply probability propagation
techniques (in this case, it is a multiconnected network) to the network in figure 7,
obtaining the probabilities for the intermediate nodes (S_i) and for the complete
system (X). For this particular case, the results for the subsystems are:

P(S1) = P(S4) = (0.729, 0.271)

P(S2) = P(S3) = (0.810, 0.190)
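
The reported system reliability can be reproduced without a BN library by enumerating component states. The sketch below is an independent check, not the authors' propagation algorithm: it assumes independent components, each with a success probability of 0.9 (the value that yields 0.9785), and uses the four paths AC, BD, AED and BEC named in the text. It also evaluates the decomposition on E described above; all function names are hypothetical.

```python
from itertools import product

PATHS = [("A", "C"), ("B", "D"), ("A", "E", "D"), ("B", "E", "C")]
NAMES = ["A", "B", "C", "D", "E"]


def system_works(state):
    """The bridge operates when every component on at least one path is good."""
    return any(all(state[c] for c in path) for path in PATHS)


def bridge_reliability(r, fixed=None):
    """Sum the probability of all working states; `fixed` pins some components."""
    fixed = fixed or {}
    free = [name for name in NAMES if name not in fixed]
    total = 0.0
    for values in product([True, False], repeat=len(free)):
        state = dict(fixed, **dict(zip(free, values)))
        p = 1.0
        for name, good in zip(free, values):
            p *= r[name] if good else 1.0 - r[name]
        if system_works(state):
            total += p
    return total


r = {name: 0.9 for name in NAMES}
direct = bridge_reliability(r)
conditioned = (bridge_reliability(r, {"E": True}) * r["E"]
               + bridge_reliability(r, {"E": False}) * (1.0 - r["E"]))
print(round(direct, 4), round(conditioned, 4))
```

The path probabilities 0.9^3 = 0.729 and 0.9^2 = 0.81 match the subsystem values listed above for the three-component and two-component paths respectively.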

We have developed an algorithm for automatically building a BN representation
from the reliability block diagram [6].



6.2. Reliability of Dependent Components 

Suppose three independent sources of shock are present in the environment [2]. A
shock from source 1 destroys component 1; it occurs at a random time $U_1$, where
$P[U_1 > t] = e^{-\lambda_1 t}$. A shock from source 2 destroys component 2; it occurs
at a random time $U_2$, where $P[U_2 > t] = e^{-\lambda_2 t}$. Finally, a shock from
source 3 destroys both components; it occurs at a random time $U_{12}$, where
$P[U_{12} > t] = e^{-\lambda_{12} t}$. Thus the random life length $T_1$ of component 1
satisfies:

$T_1 = \min(U_1, U_{12})$

while the random life length $T_2$ of component 2 satisfies:

$T_2 = \min(U_2, U_{12})$

A BN model for this example of dependent failures is shown in figure 8, where S_i
represents the i-th source and C_i the i-th component. The system states are assigned to
X. In this case, all the conditional probability matrices are defined as equivalent to
AND gates, because a series system is considered.




Fig. 8. Bayesian network of a system with common cause failures

The reliability and failure probability of the system are obtained by applying the
conventional procedure for probability propagation for multiconnected networks [7].
Reliability results for particular values are shown in table 1.



Node    Reliability
S1      0.9417
S2      0.9048
S3      0.9980
C1      0.9398
C2      0.9030
X       0.8504

Table 1. Reliability (probability of success) for a system with common cause failures
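
The component and system values in Table 1 can be recovered from the three source reliabilities by enumeration. The sketch below assumes the topology implied by the life-length equations (component 1 survives if sources 1 and 3 produce no shock, component 2 if sources 2 and 3 produce no shock, and the series system X needs both components); the helper name `prob_of` is hypothetical. It also shows the value that would result from wrongly treating the two components as independent.

```python
from itertools import product

# Source reliabilities taken from Table 1 (probability that no shock occurred).
source = {"S1": 0.9417, "S2": 0.9048, "S3": 0.9980}


def prob_of(event):
    """Sum the probability of every source state for which `event` holds."""
    total = 0.0
    for ok1, ok2, ok3 in product([True, False], repeat=3):
        p = 1.0
        for name, ok in zip(("S1", "S2", "S3"), (ok1, ok2, ok3)):
            p *= source[name] if ok else 1.0 - source[name]
        if event(ok1, ok2, ok3):
            total += p
    return total


c1 = prob_of(lambda ok1, ok2, ok3: ok1 and ok3)           # C1 = AND(S1, S3)
c2 = prob_of(lambda ok1, ok2, ok3: ok2 and ok3)           # C2 = AND(S2, S3)
x = prob_of(lambda ok1, ok2, ok3: ok1 and ok2 and ok3)    # X = AND(C1, C2)
naive = c1 * c2   # incorrect: counts the shared source S3 twice
print(round(c1, 4), round(c2, 4), round(x, 4), round(naive, 4))
```

The results agree with the table to within rounding of the inputs, and the naive product is lower than the true system reliability because the common source is penalized twice.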



7. Conclusions 

Bayesian networks are an alternative technique for system reliability analysis
with ample potential for application. They are based on the management of
conditional probability and on probability propagation. BN have a strong similarity to
fault trees. In fact, fault trees could be viewed as a particular case of BN. One of the
advantages of using Bayesian networks is the explicit representation of dependencies.

In this paper we have presented a general methodology for modeling reliability of 
complex systems based on Bayesian networks. A reliability structure represented as a 
reliability block diagram can be transformed to a Bayesian network representation, 
and with this, the reliability of the system can be obtained using probability 
propagation techniques. This allows for modeling complex systems, such as a bridge 
type, and dependencies between failures, which are difficult to represent with 
conventional reliability analysis techniques. 

This approach also allows a combination of information sources (objective and 
subjective) and the selection of the best probabilistic model according to the 
distribution and the structure of the system. The combination of information sources 
could be applied to avoid the lack of information in the data bases of certain areas for 
reliability analysis. For example, in the case of the majority of the power plants, the 
information is augmented with the estimates obtained by operators or maintenance 
personal. The combination of these sources permits to increase the precision of the 
system reliability estimation. 
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Another future direetion for researeh is to use this type of models for design. In this 
case, we can set the desired reliability of the system and obtain the required reliability 
of each compnent, using the same probability propagation techniques. 

Abstract. Most real AI applications developed for dynamic environments
have to interact with the external world, deal with imprecise data and
estimate the possible occurrence of data at different instants of time.
A temporal model suitable for this type of domain must provide a
representation framework able to capture external observations, update
this information in the internal state of the application and deduce how
these changes influence the evolution of the application. Reasoning
processes for dynamic domains are generally quite complex due to the
imprecision and variability of data. This usually leads to situations
where the time available to update all the necessary information before
processing the next change is not enough. When this occurs, the internal
model is no longer consistent with the external world, which leads to
dysfunctions in the system.

This paper presents a temporal model suitable for applications running
in dynamic environments. The proposed framework keeps the world model
consistent with the external world and supports the prediction of future
consequences. All reasoning algorithms are designed as a search process
between two time-points, which makes it possible to obtain approximate
responses to a temporal query instead of optimal but time-consuming
solutions.

Content Areas: Temporal Reasoning, Knowledge Representation 



1 Introduction 

Most real AI applications developed for dynamic environments are used to
model the behaviour of applications that interact with the external world. Three
main features define the behaviour of these applications:

a) The necessity of relating data to a clock time in order to time-stamp external
observations with their acquisition date.

b) The problem of keeping the world model consistent with the external world
and of predicting the possible future consequences that may derive from the
evolution of the application.

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 207-218, 1998. 
c) The necessity of efficient data management procedures to assert, remove,
update or retrieve temporal information at any time.

The first item states the necessity of dealing not only with qualitative
constraints [9] but also with metric information [2] [3]. This expressiveness
sometimes needs to be augmented to include the possibility of expressing
alternative situations that may arise as the application evolves. Temporal
Constraint Networks (TCN) [3] are the most popular approach to handling
disjunctive temporal constraints. However, checking consistency in a general TCN
is NP-hard [5], and even algorithms for local consistency may incur exponential
costs. In the particular environment we are concerned with, the possibility of
expressing disjunctive situations is important, but this issue can be tackled by
choosing an appropriate representational framework which allows for efficient
reasoning algorithms.

Other approaches, such as temporal graphs or time-map managers [2], seem more
adequate for applications which handle large amounts of information and where
data updates are frequent [10]. This is because algorithms for temporal
graphs do not perform an explicit propagation of the information. As a
consequence, the design of uniform search procedures is a more difficult task and
a constant response time in a recovery process cannot be ensured; but the
time saved in the assertion of new information compensates for the lack of
constant-time responses. Different reasoning mechanisms have been devised
for temporal graphs in order to reduce complexity in temporal operations [4].

Probabilistic temporal models have been specially developed to deal with
dynamic applications. Probability is used to represent data under uncertainty [6]
or to estimate the probability that a certain datum holds at a particular instant of
time [7]. In this way, data are associated with a certain probability of occurrence,
so the application can form different clusters of information according to this
probability. However, the inherent complexity of probabilistic inference makes
it impracticable to compute these operations within a given time bound.
Additionally, these models require exhaustive knowledge about the
application behaviour in order to estimate every element which may
influence the problem.

This paper presents a temporal model suitable for applications running in
dynamic environments. The paper is structured as follows: Section 2 presents the
internal time model, Section 3 describes how this model is applied to represent
application data, Section 4 explains the temporal inference process by means of
an example, Section 5 specifies the reasoning algorithms and Section 6 concludes.

2 Internal Time Model 

The proposed temporal model follows an object-oriented approach based on
reified formalisms [8] [1] and is composed of a discrete set of time-points. This
approach makes it possible to represent time explicitly, to separate the
temporal and non-temporal parts of data, to constrain data over the time line
and to associate data with a temporal interval that represents its validity.

2.1 Basic Concepts

The temporal model uses time-points as elementary primitives to represent the 
beginning and ending of data (temporal facts). 

Definition 1 (temporal fact). A temporal fact $tf_i$ is a tuple $(o_i, s_i, v_i, b_i, e_i)$
where $o_i$ is a symbol associated with an application object, $s_i$ is a temporal slot of the
object which takes different values over time, and $v_i$ is the value associated
with $s_i$ between the beginning point $b_i$ and the ending point $e_i$.

By default $e_i$ is always after $b_i$. Both $b_i$ and $e_i$ stand for the temporal validity
of value $v_i$ for slot $s_i$. When a new value $v_j$ is acquired for the same slot in the
object, a new temporal fact is created, $tf_j = (o_i, s_i, v_j, b_j, e_j)$, and $e_i$ and $b_j$ will
denote the same instant of time, thus indicating that $tf_i$ precedes $tf_j$.

Definition 2 (event). An event $ev_i$ is a tuple $ev_i = (o_i, s_i, v_i, t_i, or_i)$ where $o_i$
is the object and $s_i$ the slot subject to the modification incorporated by the event;
$v_i$ is the new value to be attached to the slot, $t_i$ is the time instant at which the
event is produced and $or_i$ represents the event origin, either external or internal.

Events are used to represent the temporal data evolution and denote the 
changes produced in temporal facts. An external event is a value acquired from 
the external world (for instance through sensors) and an internal event is a value 
generated by the reasoning system which is controlling the application. In the 
following, and for the sake of simplicity, we will not make distinctions in the 
treatment of both types of events. 



2.2 Temporal Constraints 

Time-points are seen as symbolic variables on which temporal constraints can be
posted. A time-point $tp_i$ is represented as an interval $[l_i, r_i]$ where $l_i, r_i \in \mathbb{Z}$
represent the earliest and the latest occurrence date for $tp_i$. This interval $[l_i, r_i]$
is called the temporal window of $tp_i$. Date($tp_i$) is defined as a function that returns
the temporal window of time-point $tp_i$. Two additional functions are defined to
recover the earliest and latest occurrence date of time-point $tp_i$:

LeftLim(Date($tp_i$)) = $l_i$
RightLim(Date($tp_i$)) = $r_i$

If the left and right limits of $tp_i$ are unknown then the temporal window is given by
$[-\infty, +\infty]$. When the occurrence date of $tp_i$ is perfectly known, the left and
right limits refer to the same time instant ($l_i = r_i$).

Definition 3 (temporal constraint). A binary temporal constraint is a tuple
($tp_i$ before/after $d_k$ $tp_j$) where $tp_i$ and $tp_j$ denote time-points, $d_k \in \mathbb{Z}$ and
after/before denote the type of temporal relation between both time-points.

The meaning of a temporal constraint can be stated as follows: 
- ($tp_i$ after $d_k$ $tp_j$) indicates that Date($tp_i$) must occur after $d_k$ units of time
  from Date($tp_j$), i.e., LeftLim(Date($tp_i$)) = $l_j + d_k + 1$ and
  RightLim(Date($tp_i$)) = $+\infty$. This temporal constraint computes the first instant
  of time where $tp_i$ can occur with respect to $tp_j$.

- ($tp_i$ before $d_k$ $tp_j$) indicates that Date($tp_i$) must occur before $d_k$ units of time
  from Date($tp_j$), i.e., RightLim(Date($tp_i$)) = $r_j + d_k - 1$ and
  LeftLim(Date($tp_i$)) = $-\infty$. This temporal constraint computes the latest instant
  of time where $tp_i$ can occur with respect to $tp_j$.
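
The two rules above can be sketched as a small function that derives the temporal window imposed on a time-point by a single constraint. The function name and the tuple encoding of windows are assumptions made for illustration; the arithmetic (reference limit plus distance, plus or minus one) is the one that reproduces the values [8, +inf] and [-inf, 6] shown in Fig. 1.

```python
import math

INF = math.inf


def window_from_constraint(relation, d_k, source_window):
    """Return the window that (tp_i relation d_k tp_j) imposes on tp_i.

    `source_window` is the (left, right) window of the reference point tp_j.
    """
    l_j, r_j = source_window
    if relation == "after":
        # earliest date of tp_i; no upper limit is implied
        return (l_j + d_k + 1, INF)
    if relation == "before":
        # latest date of tp_i; no lower limit is implied
        return (-INF, r_j + d_k - 1)
    raise ValueError(f"unknown relation: {relation!r}")


after_window = window_from_constraint("after", 4, (3, 3))
before_window = window_from_constraint("before", 4, (3, 3))
```

With Date(tpj) = [3, 3], the two calls correspond to the two panels of Fig. 1.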



[Figure 1 shows two timelines starting at $Tp_0$. Left panel: ($tp_i$ after 4 $tp_j$) with
Date($tp_j$) = [3, 3] gives Date($tp_i$) = [8, $+\infty$]. Right panel: ($tp_i$ before 4 $tp_j$) with
Date($tp_j$) = [3, 3] gives Date($tp_i$) = [$-\infty$, 6].]



Fig. 1. Temporal constraints 



Two special time-points $Tp_0$ and now are used in the model. The former
represents the initial time of the system clock (Date($Tp_0$) = [0, 0]). The latter represents
the current time (Date(now) = $[l_{now}, r_{now}]$), where $l_{now} = r_{now}$ is the number
of time units elapsed from Date($Tp_0$) until the current moment. In fact, for a
time-point $tp_i$, $l_i$ and $r_i$ represent the minimum and maximum number of time
units that must elapse from Date($Tp_0$) to know the exact occurrence date of $tp_i$.

The equality relation between two time-points $tp_i$ and $tp_j$ is represented as the
conjunction of two temporal constraints: ($tp_i$ after $-1$ $tp_j$) and ($tp_i$ before 1 $tp_j$).
Symbolic constraints between two time-points can be easily represented by setting $d_k$
equal to 0. In this way, "$tp_i$ occurs before $tp_j$" would be represented by means
of the temporal constraint ($tp_i$ before 0 $tp_j$).

Let $c$ be a temporal constraint of the form ($tp_i$ after $d_k$ $tp_j$) or
($tp_i$ before $d_k$ $tp_j$). The following functions are defined over the set of
temporal constraints:

- Source($c$) = $tp_j$ is the reference time-point to which the temporal constraint
  is applied.

- Distance($c$) = $d_k$ is the temporal distance defined in $c$.

- Relation($c$) = after/before is the type of the temporal relation in $c$.



2.3 Properties of the Internal Time Model 

Some of the most relevant properties of the internal time model are: 

Property 1: finite in the beginning: ($Tp_0$ before 1 $tp_i$)

Property 2: symmetry

$\forall tp_i, tp_j, d_k$ : ($tp_i$ after $d_k$ $tp_j$) $\Leftrightarrow$ ($tp_j$ before $-d_k$ $tp_i$)

$\forall tp_i, tp_j, d_k$ : ($tp_i$ before $d_k$ $tp_j$) $\Leftrightarrow$ ($tp_j$ after $-d_k$ $tp_i$)

Property 3: transitiveness

$\forall tp_i, tp_j, tp_k, d_k, d_l$ : ($tp_i$ before $d_k$ $tp_j$) $\wedge$ ($tp_j$ before $d_l$ $tp_k$)
$\Rightarrow$ ($tp_i$ before $d_k + d_l - 1$ $tp_k$)

Property 4: reflexiveness: $\forall tp_i, d_k > 0$ : ($tp_i$ before $d_k$ $tp_i$)

Property 5: partial order

$\forall tp_i, tp_j, d_k$ : ($tp_i$ before $d_k$ $tp_j$) $\Rightarrow$ $\forall d_l > d_k$ . ($tp_i$ before $d_l$ $tp_j$)

To denote that two time-points are partially ordered ($tp_i \leq tp_j$), the following
temporal constraint can be used: ($tp_i$ before 1 $tp_j$).
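
The distance that results from combining two before constraints can be checked with a short derivation. It assumes that (tp_i before d tp_j) is read as the inequality tp_i <= tp_j + d - 1, which is the reading suggested by the right limit r_j + d - 1 used for before constraints; the symbol d_l for the second distance is introduced here for illustration.

```latex
\begin{gather*}
tp_i \le tp_j + d_k - 1, \qquad tp_j \le tp_k + d_l - 1 \\
\Longrightarrow\quad tp_i \le tp_k + d_k + d_l - 2 = tp_k + (d_k + d_l - 1) - 1
\end{gather*}
```

Under that reading, the last inequality is the constraint (tp_i before d_k + d_l - 1 tp_k). For instance, a before constraint of distance 4 followed by one of distance 3 combines into a before constraint of distance 6.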



2.4 Time-Point Representation 

Definition 4 (time-point). A time-point $tp_i$ is defined as a tuple $(l_i, r_i, S_i)$
where $l_i$ and $r_i$ stand for the earliest and latest occurrence date of $tp_i$ and $S_i$ is
the set of all temporal constraints posted on $tp_i$. $S_i$ is defined as a disjunction
of sets of temporal constraints $\{S_{i1}, S_{i2}, \ldots, S_{in}\}$ where each $S_{ik}$ represents a
possible temporal occurrence for $tp_i$.

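Definition 5 (group of constraints). A particular set $S_{ik} \in S_i$ is defined as
a group of temporal constraints over $tp_i$. A group of temporal constraints $S_{ik}$
defines a temporal interval in the following way:

$\mathrm{LowerBound}(S_{ik}) = \{\, a \mid a \in \mathbb{Z},\ a = \max_c(\mathrm{Distance}(c) + \mathrm{LeftLim}(\mathrm{Date}(\mathrm{Source}(c))) + 1),\ \forall c \in S_{ik} .\ \mathrm{Relation}(c) = \mathit{after} \,\}$

$\mathrm{UpperBound}(S_{ik}) = \{\, a \mid a \in \mathbb{Z},\ a = \min_c(\mathrm{Distance}(c) + \mathrm{RightLim}(\mathrm{Date}(\mathrm{Source}(c))) - 1),\ \forall c \in S_{ik} .\ \mathrm{Relation}(c) = \mathit{before} \,\}$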

Proposition 1. Let $c_1$, $c_2$ be two temporal constraints of the form ($tp_i$
after $d_k$ $tp_j$) and ($tp_i$ before $d_l$ $tp_j$) respectively, which belong to a group $S_{ik}$.
Temporal constraints $c_1$ and $c_2$ are consistent if and only if $d_l \geq d_k + 2$.

The proof follows directly from the properties of symmetry, transitiveness
and reflexiveness. Let us illustrate the proposition with an example. Let $tp_j$ =
$(3, 3, \emptyset)$ be a time-point with no constraints defined over it and whose occurrence
date is at time instant 3, and $tp_i = (x, y, \{S_{i1}\})$ where $S_{i1}$ = {($tp_i$ after 2 $tp_j$),
($tp_i$ before 3 $tp_j$)}. There exists only one group of constraints for $tp_i$, which is
composed of two constraints applied to $tp_j$. By computing the lower and upper
bounds of $S_{i1}$ we obtain LowerBound($S_{i1}$) = 2 + 3 + 1 = 6 and
UpperBound($S_{i1}$) = 3 + 3 - 1 = 5, thus resulting in an inconsistent interval [6, 5]. In this
way, the distance of a before temporal constraint must be at least two time units
greater than the temporal distance of an after temporal constraint in the same
group of constraints applied to the same time-point.
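
The bounds of a group and the consistency condition of Proposition 1 can be sketched as follows. The encoding of a constraint as a (relation, distance, source) triple and the function names are assumptions made for illustration; the example reproduces the inconsistent interval [6, 5] computed above.

```python
def group_bounds(group, windows):
    """Return (LowerBound, UpperBound) for one group of constraints.

    `group` is a list of (relation, distance, source) triples and `windows`
    maps each source time-point name to its (left, right) window.
    """
    lower = max(
        (d + windows[src][0] + 1 for rel, d, src in group if rel == "after"),
        default=float("-inf"),
    )
    upper = min(
        (d + windows[src][1] - 1 for rel, d, src in group if rel == "before"),
        default=float("inf"),
    )
    return lower, upper


def is_consistent_interval(bounds):
    """A group is consistent when its interval is not empty."""
    lower, upper = bounds
    return lower <= upper


windows = {"tpj": (3, 3)}
bad = group_bounds([("after", 2, "tpj"), ("before", 3, "tpj")], windows)
ok = group_bounds([("after", 2, "tpj"), ("before", 4, "tpj")], windows)
```

Raising the before distance from 3 to 4, i.e. two units above the after distance, is the smallest change that makes the interval non-empty.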

Definition 6 (joining of constraints). Let $S_i = \{S_{i1}, S_{i2}, \ldots, S_{in}\}$ for time-point
$tp_i$. The final temporal window for $tp_i$ is calculated according to the lower
and upper bound of each $S_{ik}$. Hence:

LeftLim(Date($tp_i$)) = $\{\, d \mid d \in \mathbb{Z},\ d = \min(\mathrm{LowerBound}(S_{ik}))\ \forall k \in 1, \ldots, n \,\}$
RightLim(Date($tp_i$)) = $\{\, d \mid d \in \mathbb{Z},\ d = \max(\mathrm{UpperBound}(S_{ik}))\ \forall k \in 1, \ldots, n \,\}$
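
A minimal sketch of the joining step: the window of a time-point is the smallest lower bound and the largest upper bound over its alternative groups. Discarding groups whose interval is empty before joining is an assumption of this sketch (the definition itself does not say how inconsistent groups are treated), and the function name and the sample intervals are hypothetical.

```python
def join_groups(group_intervals):
    """Join the (lower, upper) intervals of alternative constraint groups.

    Returns the unconstrained window when there are no groups at all, and
    None when every group is inconsistent (lower > upper).
    """
    if not group_intervals:
        return (float("-inf"), float("inf"))
    # Assumption: groups with an empty interval do not contribute.
    consistent = [(lo, hi) for lo, hi in group_intervals if lo <= hi]
    if not consistent:
        return None
    left = min(lo for lo, _ in consistent)
    right = max(hi for _, hi in consistent)
    return (left, right)


window = join_groups([(5, 7), (6, 10), (8, 9)])
```

Because the groups are alternatives, the joined window is the hull of the individual intervals, not their intersection.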

Temporal information is represented in a graph where nodes correspond to 
time-points and edges denote temporal constraints. 

Each edge, or each path with combinable edges (transitiveness property), between
two nodes gives rise to a temporal constraint between them. Let $CA_{ij}$ and
$CB_{ij}$ be the sets of all temporal constraints connecting $tp_i$ and $tp_j$ through the
temporal relations after and before respectively. For each temporal constraint $c$
which belongs to one of those sets, Distance($c$) is calculated by successively
applying the transitiveness property over the time-points which make up the path
in the graph.

3 Temporal Data Representation 

3.1 Temporal States 

A temporal fact can be in one of these states: past, current or future. Recall that
the special time-point now represents the current time, i.e. the number of time
units elapsed from $Tp_0$ (initial time at which the application execution starts)
until the current moment.

Definition 7. A temporal fact $tf_i = (o_i, s_i, v_i, b_i, e_i)$ is a past fact if the temporal
constraint ($e_i$ before 1 now) holds.

For a past temporal fact it also holds that ($b_i$ before 0 now). This means that both
Date($b_i$) and Date($e_i$) are precise dates, concrete instants of time preceding the
current time now (or equal to now in the case of the ending time-point).

Definition 8. A temporal fact $tf_i = (o_i, s_i, v_i, b_i, e_i)$ is a current fact if the temporal
constraints ($b_i$ before 1 now) and ($e_i$ after 0 now) hold. The beginning time
of a current temporal fact is a perfectly known date and its ending time is only
partially known.

Definition 9. A temporal fact $tf_i = (o_i, s_i, v_i, b_i, e_i)$ is a future fact if the temporal
constraints ($b_i$ after 0 now) and ($e_i$ after 0 now) hold. In this case, both are
imprecise dates.
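
The three states can be sketched as a classification over the windows of the beginning and ending time-points. The sketch assumes that a constraint against now "holds" when every date in the window satisfies it, i.e. (x before d now) holds when the right limit of x is at most now + d - 1 and (x after d now) holds when the left limit of x is at least now + d + 1. The function names and the sample windows are hypothetical.

```python
import math


def holds_before(window, d, now):
    """(tp before d now) holds when the latest date of tp is <= now + d - 1."""
    return window[1] <= now + d - 1


def holds_after(window, d, now):
    """(tp after d now) holds when the earliest date of tp is >= now + d + 1."""
    return window[0] >= now + d + 1


def classify(b_window, e_window, now):
    """Return 'past', 'current' or 'future' for a fact with the given windows."""
    if holds_before(e_window, 1, now):
        return "past"
    if holds_before(b_window, 1, now) and holds_after(e_window, 0, now):
        return "current"
    if holds_after(b_window, 0, now) and holds_after(e_window, 0, now):
        return "future"
    return "undetermined"


state_future = classify((4, 10), (5, math.inf), now=3)
state_current = classify((6, 6), (7, math.inf), now=6)
state_past = classify((6, 6), (8, 8), now=8)
```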
3.2 Events

Events are used to model the temporal changes in temporal facts. There are two
main modifications that may occur in temporal facts: a current fact may become
past or a future temporal fact may become current.

Definition 10. Let $tf_i = (o_i, s_i, v_i, b_i, e_i)$ be a current temporal fact at the
current moment (now); $tf_i$ will become a past fact at a future time $t' > now$ if
an event $ev_j = (o_i, s_i, v_j, t', or_j)$ occurs (the event brings about a new value for
the temporal slot $s_i$ in the object $o_i$, where $v_i \neq v_j$).

The condition that must hold for an event $ev_j = (o_i, s_i, v_j, t', or_j)$ to produce
a current-to-past modification in temporal fact $tf_i$ is: LeftLim(Date($e_i$)) $\leq$
$t'$ $\leq$ RightLim(Date($e_i$)). In this case $ev_j$ confirms the date for the end of $tf_i$
and Date($e_i$) is set to $[t', t']$.

Definition 11. Let $tf_i = (o_i, s_i, v_i, b_i, e_i)$ be a future temporal fact at the current
moment (now); $tf_i$ will become a current fact at a future time $t' > now$ if an
event $ev_i = (o_i, s_i, v_i, t', or_i)$ occurs (the event confirms the future value $v_i$ of
slot $s_i$ in the object $o_i$).

The condition that must hold for an event $ev_i = (o_i, s_i, v_i, t', or_i)$ to produce
a future-to-current modification in temporal fact $tf_i$ is: LeftLim(Date($b_i$)) $\leq t' \leq$
RightLim(Date($b_i$)). In this case $ev_i$ confirms the prediction and Date($b_i$) is set
to $[t', t']$.
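
The two modifications can be sketched as one update function: an event with a different value ends a current fact, and an event with the predicted value starts a future fact; in both cases the affected time-point collapses to the exact date of the event. The dictionary layout, the function names, the inclusive window test and the ending window (6, +inf) are assumptions made for illustration.

```python
import math


def confirm(window, t):
    """Collapse a window to the exact date (t, t) if t lies inside it."""
    left, right = window
    return (t, t) if left <= t <= right else None


def apply_event(fact, state, value, t):
    """Return an updated copy of `fact`, or None when the event does not apply.

    A current fact ends when a different value arrives (its e is confirmed);
    a future fact starts when its predicted value arrives (its b is confirmed).
    """
    if state == "current" and value != fact["value"]:
        point = "e"
    elif state == "future" and value == fact["value"]:
        point = "b"
    else:
        return None
    confirmed = confirm(fact[point], t)
    if confirmed is None:
        return None
    return {**fact, point: confirmed}


tf2 = {"value": "free", "b": (5, 9), "e": (6, math.inf)}
tf2_current = apply_event(tf2, "future", "free", 6)
tf2_past = apply_event(tf2_current, "current", "occupied", 8)
```

An event whose date falls outside the relevant window leaves the fact unchanged, which the sketch signals by returning None.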

4 Temporal Causality 

Causal relations are represented in the model by means of temporal constraints.
The principle of causality states that the effects of a causal relation hold if and
only if all the causes in the relation already hold in the knowledge base. In other
words, the beginning time of temporal facts representing the consequences of a
causal relation can never occur before the beginning time of their causes. This
statement of causality is extended in our model by allowing new information to be
inferred on the basis of future temporal facts. The newly deduced temporal fact
will hold as future data until all the premises are confirmed, i.e. until the
temporal facts used for the deduction become current facts. This means the model can
perform inference based on future data in order to anticipate what is expected
to occur as the application evolves. Let us take an example of a possible causal
relation in the blocks world domain. The classical action of picking up a block
is stated as follows:

if (?block status free ?b1 ?e1) and
   (robotarm status free ?b2 ?e2)
then
   (?block status holding)

The identifier ?block is a variable to be instantiated to the corresponding
block, and the values instantiated in ?b1 ?e1 ?b2 ?e2 are the time-points associated
with the beginning and ending points of the temporal facts that satisfy each premise.
Let $tf_1$ and $tf_2$ be the temporal facts that match the first and second premise of
the causal relation respectively. The situations that may arise are the following:

- If $tf_1$ is a past temporal fact and $tf_2$ is a current or future fact (or vice versa)
  then it is clear both conditions will never hold simultaneously, because by
  the time the robotarm is free the block's top is no longer free; in this
  case the system can never infer that the block's status is holding.

- If, at the time the operator is being evaluated, both $tf_1$ and $tf_2$ are current
  temporal facts then we can ensure that both conditions currently hold, so
  the block can be held now.

- If either of the temporal facts is future (and the other is current or future
  too) the system will not be able to deduce that the block is held now. However, it
  may be possible to conclude that block A will be held sometime in the future. The
  system will infer such information if there exists at least one future instant
  of time where both temporal facts may hold simultaneously as current facts.
  This is determined by the test of temporal intersection.

Definition 12 (temporal intersection). Let $T = \{tf_1, tf_2, \ldots, tf_n\}$ be the set
of temporal facts that match the premises of a causal relation. It is said there
exists temporal intersection among the temporal facts at the current time now if
the following condition holds:

$\forall i, j \in [1, \ldots, n],\ i \neq j,\ (e_i\ \mathit{after}\ 0\ b_j) \wedge (e_i\ \mathit{after}\ 0\ \mathit{now})$

This test determines if temporal facts matching the premises currently hold 
or may hold simultaneously in a future time. If the result of the test is true then 
the model computes the temporal interval where the conclusions are expected 
to occur or are actually occurring; in the latter case the left and right bounds of
this interval will be equal and will represent a time instant before or equal to now.
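
One possible reading of the test is sketched below. Since the windows may still be imprecise, the sketch treats each after constraint as satisfiable rather than entailed: an ending point may occur after a reference date if its right limit reaches that date plus one. This interpretation, the function names and the ending windows used in the example are assumptions; the source only gives the condition itself.

```python
import math


def may_be_after(window, d, ref_left):
    """True if some date in `window` is at least ref_left + d + 1."""
    return window[1] >= ref_left + d + 1


def temporal_intersection(facts, now):
    """`facts` is a list of (b_window, e_window) pairs, one per premise."""
    for i, (_, e_i) in enumerate(facts):
        # every ending point must still be able to occur after now
        if not may_be_after(e_i, 0, now):
            return False
        # and after the earliest beginning of every other fact
        for j, (b_j, _) in enumerate(facts):
            if i != j and not may_be_after(e_i, 0, b_j[0]):
                return False
    return True


# Two future facts seen at now = 3 (ending windows are hypothetical).
both_future = temporal_intersection(
    [((4, 10), (5, math.inf)), ((5, 9), (6, math.inf))], now=3
)
# The second fact ended at 8, seen at now = 8.
one_past = temporal_intersection(
    [((9, 10), (10, math.inf)), ((6, 6), (8, 8))], now=8
)
```

As soon as one premise has become a past fact, its ending point can no longer be after now and the test fails.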

Let $tf_1$ = (blockA, status, free, $b_1$, $e_1$) and $tf_2$ = (robotarm, status, free,
$b_2$, $e_2$) be the two temporal facts that fulfill the premises of the causal relation
and Date(now) = [3, 3]. Assuming both $tf_1$ and $tf_2$ are future temporal facts
with Date($b_1$) = [4, 10] and Date($b_2$) = [5, 9], the conclusion will be a future
temporal fact $tf_3$ = (blockA, status, holding, $b_3$, $e_3$). The exact occurrence date
for $tf_3$ will depend on which temporal fact, $tf_1$ or $tf_2$, is the last to become
current. This gives rise to three possible situations (corresponding to
situations $S_{31}$, $S_{32}$ and $S_{33}$ in Fig. 2): 1) $tf_1$ is the last temporal fact to become
current, therefore block A will be held when its top is free, 2) $tf_2$ is the last
temporal fact to become current, so block A will be held when the robotarm is free,
or 3) both $tf_1$ and $tf_2$ become current at the same time.

Time-point $b_3$ is defined as (5, 10, $\{S_{31}, S_{32}, S_{33}\}$), which means that the earliest
time instant where block A can be held is 5 and the latest is 10.

Let us assume that Date(now) = [6, 6] and the first event acquired from
the external world confirms that robotarm is free at the current time [6, 6] ($ev_j$ =
(robotarm, status, free, 6, $or_j$)). Then $tf_2$ becomes a current fact with
Date($b_2$) = [6, 6]. The model updates the temporal window for $b_3$ (Fig. 3).

Temporal Representation and Reasoning for Dynamic Environments 215





Since Date(now) = [6, 6] and blockA is not holding now, the only consistent
group for $b_3$ is $S_{31}$ and then $b_3$ = (7, 10, $\{S_{31}\}$). This indicates that blockA can
be held at 7 as the earliest time instant. Let us assume that two time units have
elapsed and Date(now) = [8, 8]. An external event reports that robotarm is
occupied at that time. Then $tf_2$ becomes a past fact with Date($e_2$) = [8, 8]. It
is obvious that the model can no longer infer that blockA will be held. But we need to
add some information in order to detect this situation.

The test of temporal intersection requires checking that every ending time-point
occurs after the beginning time-point of each temporal fact. This is a
temporal relation which must be satisfied in order to infer the conclusions, but
it is not possible to represent such a relation by posting a temporal constraint
between the two time-points. In other words, a temporal constraint states a
temporal-causal relation between two time-points and constrains the temporal
window of a time-point according to the updates in the temporal window of
the other time-point.

The desired effect is achieved by setting a temporal constraint between the
beginning of the conclusion and the ending of each of the causes (($b_3$ before 0 $e_1$),
($b_3$ before 0 $e_2$)). These temporal constraints are used to denote that the exact
occurrence date for $b_3$ should be known before the dates for $e_1$ and $e_2$.
5 Temporal Reasoning

This section describes the reasoning algorithms of the temporal model. The tasks
to be performed are asserting, deleting, updating and recovering information.
Updating and deletion require a propagation process in the graph.
Assertion and queries are carried out by means of a search process.

5.1 Recovery Process 

The recovery process consists in retrieving the sufficiently restricted (according
to the constraint expressed in the query) temporal constraint between two nodes
in the graph. By obtaining existing edges or combinable paths between two
nodes the recovery process can return three types of answers: TRUE, FALSE or
POSSIBLE. Below we provide definitions for a before query; answers
for an after query can be obtained by applying Property 2 to these definitions:

Definition 13. q_bef($tp_j$, $d_k$, $tp_i$) = TRUE $\Leftrightarrow$ $\exists c \in CB_{ij}$, Distance($c$) $\leq d_k$

Definition 14. q_bef($tp_j$, $d_k$, $tp_i$) = FALSE $\Leftrightarrow$ $\exists c \in CA_{ij}$, Distance($c$) $\geq d_k - 1$

Definition 15. q_bef($tp_j$, $d_k$, $tp_i$) = POSSIBLE $\Leftrightarrow$ $\forall c_1 \in CB_{ij}, c_2 \in CA_{ij}$,
Distance($c_1$) $> d_k$ $\wedge$ Distance($c_2$) $< d_k - 1$

A TRUE response is obtained for a query when there already exists a path
in the graph which represents that temporal constraint. A FALSE answer is
obtained when the contrary restriction is found in the graph. A POSSIBLE
answer is obtained when there are no paths which confirm either a TRUE or
a FALSE response.
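
The three answers of a before query can be sketched as a function over the distances already found between the two nodes. The function name and argument layout are assumptions; the thresholds follow from reading a before constraint of distance d as "at most d - 1 later" and an after constraint of distance d as "at least d + 1 later", so a before path no longer than the query distance entails the query, and an after path of distance at least the query distance minus one contradicts it.

```python
def q_bef(d_k, before_distances, after_distances):
    """Answer a before query of distance d_k between two time-points.

    `before_distances` and `after_distances` are the distances of the
    before and after constraints (edges or combined paths) found so far.
    """
    if any(d <= d_k for d in before_distances):
        return "TRUE"       # an existing before path is at least as tight
    if any(d >= d_k - 1 for d in after_distances):
        return "FALSE"      # an existing after path contradicts the query
    return "POSSIBLE"       # nothing confirms or refutes the query


answers = [
    q_bef(4, [3], []),
    q_bef(4, [6], [3]),
    q_bef(4, [6], [1]),
]
```

A POSSIBLE answer may later turn into TRUE or FALSE when the search finds further paths.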

5.2 Consistency Test 

The consistency test checks whether the new temporal constraint to be asserted
is consistent with the rest of the information in the graph. To carry out this test
the model checks one of the following two conditions, depending on the temporal
relation type of the constraint:

Definition 16. A constraint of the form ($tp_i$ before $d_k$ $tp_j$) is consistent if
$\forall c \in CA_{ij}$ . $d_k >$ Distance($c$) $+ 1$

Definition 17. A constraint of the form ($tp_i$ after $d_k$ $tp_j$) is consistent if
$\forall c \in CB_{ij}$ . $d_k <$ Distance($c$) $- 1$
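
The two conditions can be sketched as a single check of a new constraint against the distances of the opposite relation already present between the same pair of time-points. The function name and argument layout are assumptions made for illustration.

```python
def is_consistent(relation, d_k, opposite_distances):
    """Check a new (tp_i relation d_k tp_j) against existing constraints.

    `opposite_distances` holds the distances of the constraints of the
    opposite relation between tp_i and tp_j (after ones for a new before
    constraint, before ones for a new after constraint).
    """
    if relation == "before":
        return all(d_k > d + 1 for d in opposite_distances)
    if relation == "after":
        return all(d_k < d - 1 for d in opposite_distances)
    raise ValueError(f"unknown relation: {relation!r}")


rejected = is_consistent("before", 3, [2])   # before 3 against after 2
accepted = is_consistent("before", 4, [2])   # before 4 against after 2
```

The rejected case is the pair of constraints (after 2, before 3) that produces the empty interval [6, 5] in the example of Proposition 1.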

From the above definitions it can be easily deduced that the graph is consistent
if and only if there are no path-cycles composed of after edges whose distance
is $\geq 0$ or path-cycles composed of before edges whose distance is $\leq 0$.

Proposition 2 (consistency). A temporal graph is consistent if and only if
$\forall tp_i$ . ($tp_i$ before $d_k$ $tp_i$) $\Rightarrow d_k > 0$ and $\forall tp_i$ . ($tp_i$ after $d_k$ $tp_i$) $\Rightarrow d_k < 0$







5.3 Search Procedure 

The search process is carried out in two phases: a) an interval phase where a
response is obtained by consulting the left and right limits of time-points and
b) an expansion phase where a search is performed from the source node to
the destination by combining temporal constraints. The interval phase consists
in obtaining a first response to a particular temporal query by subtracting the
left and right limits of the two time-points involved. For a specific time-point
$tp_i$, a list with all nodes which constitute the most restrictive path between $tp_i$
and $Tp_0$ is maintained. This information is useful for two reasons:

- if a temporal search involves two time-points in the same restrictive path, the
  minimal/maximal optimal distance separating both nodes can be obtained
  by simply subtracting their left/right limits.

- if the two time-points are not found in the same restrictive path then an
  approximate solution can be obtained by subtracting their left/right limits.

In this second case, there is no guarantee that the obtained solution is optimal.
If the resulting distance does not answer TRUE or FALSE to our query,
the expansion phase is activated. This is based on a search algorithm where
parallel differences between time-points (subtracting both left and right limits)
are used as a heuristic evaluation. The algorithm is implemented following [11],
which yields linear search costs in practice.

Let $tp_i$ and $tp_j$ be two time-points. The maximum before distance of $tp_j$ with
respect to $tp_i$ is $r_j - l_i + 1$, since $r_j$ is the latest occurrence date for $tp_j$ and $l_i$ is
the earliest occurrence date for $tp_i$. Consequently, this distance can be used as an
upper bound in the process of computing a q_bef between the two nodes. That
is, the goal of the search process is to minimize that distance and the solution is
progressively refined as a new shorter distance is found in the expansion phase.
This can also be applied to an after constraint by obtaining a lower bound from
subtracting limits $l_j$ and $r_i$ ($l_j - r_i - 1$).
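
The two interval-phase bounds can be sketched directly from the limits of the time-points. The function names are assumptions, and so are the full windows used in the example: only the left limit 9 of tp4 and the right limit 2 of tp1 come from the query example discussed for Fig. 4, the other two limits are placeholders.

```python
def max_before_distance(tp_j, tp_i):
    """Upper bound for a before query: r_j - l_i + 1."""
    return tp_j[1] - tp_i[0] + 1


def min_after_distance(tp_j, tp_i):
    """Lower bound for an after query: l_j - r_i - 1."""
    return tp_j[0] - tp_i[1] - 1


tp1 = (0, 2)    # (left, right); the left limit is a placeholder
tp4 = (9, 12)   # (left, right); the right limit is a placeholder

after_bound = min_after_distance(tp4, tp1)     # 9 - 2 - 1
before_bound = max_before_distance(tp4, tp1)   # 12 - 0 + 1
```

Since the after bound obtained from the limits is below the queried distance of 8, the interval phase alone cannot confirm the query and the expansion phase has to look for a tighter path.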







Fig. 4. An example of temporal graph 



Let us take the graph in Fig. 4 and the query q_aft($tp_4$, 8, $tp_1$). By comparing $l_4$
and $r_1$ the response would be POSSIBLE ($9 - 2 - 1 < 8$). However, there exists
a path connecting $tp_4$ and $tp_1$ through an intermediate time-point which answers
TRUE to our query ($5 + 2 + 1 = 8$); that is, the combination of constraints which
determines the left limit of $tp_4$ constitutes the most restrictive path from $Tp_0$ to
$tp_4$ (Date($Tp_0$) = [0, 0], $l_4 = 0 + 8 + 1 = 9$). Since $tp_1$ is found on that path, the
optimal distance between $tp_1$ and $tp_4$ can be obtained by computing $l_4 - l_1 - 1$. In
summary, when a time-point $tp_i$ is on the most restrictive path from $Tp_0$ to another
time-point $tp_j$, the optimal after/before distance between both time-points is
computed by subtracting their left/right limits.

The advantage of the proposed method is that the first phase yields
a very rapid response, which may be the optimal one in some cases. Otherwise,
the expansion phase can find a more refined solution as more computation time
is given to the algorithm.

6 Conclusions 

The main contributions of the presented work are: a) the definition and design
of a representation framework specially adapted for dynamic environments, with
the ability to reason about past, current and future data, b) the specification
of an internal time model which allows any temporal requirement to be handled by
means of temporal constraints, thus facilitating temporal data management,
and c) the design of reasoning algorithms based, as far as possible, on
simple arithmetic operations on the limits of time-points, which makes it possible
to compute an approximate response very rapidly. This is an important
requirement in dynamic environments such as real-time systems.

Abstract. We determine special cases where the behaviour of
non-oblivious local search is worse than the behaviour of classical local
search. We propose some modifications to the non-oblivious objective
function in order to cover these cases. We present an empirical analysis
and comparative results among the analysed algorithms. This empirical
analysis shows that non-oblivious local search (using the new objective
function introduced here), combined with a tabu strategy and the use
of the complemented value of the last local optimum as a mechanism
for re-starting the search, obtains in practice better solutions than
classical local search or non-oblivious local search alone.



1 Introduction 

The objective of Automatic Theorem Proving (ATP) is to design efficient
algorithms to demonstrate the validity of a logical formula. In the case of
propositional logic, it is well known that the Satisfiability problem (SAT) and the
Maximum Satisfiability problem (MaxSAT) are computationally hard problems,
in the sense that all known algorithms that solve them require, in the worst
case, a number of steps exponential in the length of the input.

The SAT problem consists of deciding whether a Boolean conjunctive form F
is satisfiable. The SAT problem is NP-complete even when it is restricted to
instances with exactly three literals per clause (3-SAT problem). Its optimization
version, the MaxSAT problem, computes the maximum number of simultaneously
satisfiable clauses in F. The MaxSAT problem is NP-hard even for formulas with
at most two literals per clause.

Both SAT and MaxSAT problems are central problems for ATP and for 
complexity theory. The interest in SAT and MaxSAT is also motivated by the 
important role that they play as representative problems of their complexity 
class. Therefore, there is a great interest in the design of efficient algorithms for 
the resolution of these problems. 

Recently, there has been a renaissance in the study of effective heuristic
algorithms based on local search. Starting from the late eighties, both discrete
and continuous-based greedy algorithms have been proposed (see [2]). Of course,
it is not our purpose to present an exhaustive list of these algorithms. Instead,
we present a new objective function that improves the general behaviour of
non-oblivious 1-local search.

We also present different strategies to improve the general behaviour
of algorithms based on the local search paradigm. We add to the local search
a tabu search strategy with the purpose of speeding up the phase that reaches a
local optimum, and we propose the use of the complemented value of the local
optimum found previously as a mechanism for determining a 'good' point from
which to re-start the search after arriving at a local optimum, in order to avoid
revisiting paths already explored.

The empirical analysis and the comparative results among these algorithms
show that non-oblivious local search with the tabu strategy and the use
of complemented values behaves robustly and obtains in practice better
solutions than non-oblivious local search alone.

1.1 Preliminary Definitions 

Let $X = \{x_1, \dots, x_n\}$ be a set of n Boolean variables. Let $Lit(X)$ be the set of
literals: $Lit(X) = X \cup \{\neg x_1, \dots, \neg x_n\}$.

A clause C is a disjunction of literals. For a natural number k, a k-clause
is a clause consisting of exactly k literals. We denote the number of literals of
the clause C by $|C|$. A conjunctive form (CF), or formula, is a conjunction of
clauses. A k-CF is a CF containing only k-clauses.

An assignment A is a function $A : X \to \{true, false\}$. There are $2^n$ different
assignments that can be defined over a set X with n Boolean variables. We will
also consider an assignment A as a set of literals. The sign with which a literal
appears in A is the logical value that the assignment A gives to its variable x.
Note that there are no duplicate variables in an assignment.

A clause is true if at least one of its literals is true. A CF is true if each of 
its clauses is true. 

For any assignment $A \subseteq Lit(X)$ and any CF F with clauses $C_1, \dots, C_m$,
where $C_i = l_{i1} \vee \dots \vee l_{ik_i}$, let $c(F, A)$ be the number of clauses
satisfied by A, i.e.:

$c(F, A) = |\{ i \leq m \mid \exists j \in [1, k_i] : l_{ij} \in A \}|$.

Evidently, each CF F with m clauses is satisfiable iff there is an assignment A
such that $c(F, A) = m$. MaxSAT is the problem that, given a CF F, obtains an
assignment $A_0 \subseteq Lit(X)$ such that $c(F, A_0) = \max\{c(F, A) \mid A \subseteq Lit(X)\}$.
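The definitions of c(F, A) and MaxSAT above can be made concrete with a minimal sketch. The encoding is an assumption made for illustration only: a literal is a non-zero integer (+j for x_j, -j for its negation) and an assignment is a tuple of truth values. The exhaustive search is exponential and is only meant to mirror the definition.

```python
from itertools import product

def count_satisfied(formula, assignment):
    """c(F, A): number of clauses with at least one true literal."""
    def is_true(lit):
        value = assignment[abs(lit) - 1]
        return value if lit > 0 else not value
    return sum(1 for clause in formula if any(is_true(l) for l in clause))

def max_sat_brute_force(formula, n):
    """Try all 2^n assignments and keep one satisfying the most clauses."""
    return max(product([False, True], repeat=n),
               key=lambda a: count_satisfied(formula, a))

# (x1 or x2) and (not x1 or x2) and (not x2 or x3) and (not x3 or not x2)
F = [(1, 2), (-1, 2), (-2, 3), (-3, -2)]
best = max_sat_brute_force(F, 3)
print(best, count_satisfied(F, best))
```

This small formula is unsatisfiable, so the best assignment satisfies three of its four clauses.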

Let $P(\cdot, \cdot)$ be a prototypical NP optimization problem:

Instance: A real function $f : D \to R$.

Solution: A point $x_0 \in D$ such that $f(x_0) = \max\{f(x) \mid x \in D\}$.

Let $Sol(P(f, D))$ be the set of solutions of $P(f, D)$, and let
$A : (f, D) \mapsto A(f, D) \in D$ be a procedure that proposes for any instance $(f, D)$ a
candidate to be an element of $Sol(P(f, D))$. We say that the algorithm A
approximates the problem P with performance ratio $r \in R$ if for each instance
$(f, D)$ of P we have $|f(A(f, D))| \geq r \cdot |f(x)|$ for some solution
$x \in Sol(P(f, D))$; in other words,
the maximized value proposed by the algorithm is within a ratio r of the actual
maximized value.

Any polynomial time algorithm A approximating the optimization problem
and which for any instance $(f, D)$ delivers a solution of value at least within a
ratio r of the optimal value is said to be an r-approximation algorithm (PTAA).
The performance ratio r is usually called the guarantee factor of the algorithm.
There arise several possibilities:

1. There is a sequence of algorithms $\{A_n\}_n$ such that each $A_n$ is a PTAA of
ratio $r_n$ and the sequence is such that $r_n \to 1$ as $n \to +\infty$. In this case the
collection of algorithms $\{A_n\}_n$ is said to be a polynomial time approximation
scheme (PTAS).

2. There is a PTAA only for some ratio r < 1. 

3. For any r there is no PTAA. 

There is, naturally, the problem of characterizing the optimization problems 
falling in each one of the above categories. 

For many optimization problems, algorithms with small worst-case ratios
were quickly found. In some other cases, the problem of finding a worst-case
ratio r algorithm, for any constant $r \in [0, 1]$, was proved to be as hard as finding
an optimal solution. For example, it is known that there is a constant threshold
c < 1 such that if MaxSAT could be approximated in polynomial time with a
guarantee factor better than c then P=NP. Explicit constant thresholds are
known for Max2SAT and for Max3SAT [3].

However, it is also important to analyse the properties of the MaxSAT prob- 
lem, with the objective of determining the best guarantee factor r that can be 
reached in polynomial time. 



2 Local Search Paradigm 

Among the approximation procedures, local search [12] has proved to be an
efficient technique for finding assignments that satisfy a satisfiable Boolean
formula. Local search or local optimization is a primitive form of continuous
optimization in a discrete search space. It was one of the early techniques
proposed to cope with the challenging computational intractability of NP-hard 
combinatorial optimization problems. 

The local search method can be analysed through the structure $\langle x_0, f, H \rangle$,
where:

$x_0$ is the point used to start the search,

$f$ is the function that we want to minimize or maximize, and

$H$ is the real-valued distance function or metric between points in the search space.

For the MaxSAT problem, the search procedure operates over the discrete
space D of possible assignments of the Boolean formula. Usually, the local search
starts with a randomly generated solution $x_0$, and the Hamming distance H is
used as the metric (the Hamming distance $H(x, y)$ between two binary strings x
and y is the number of bits in which the two strings differ).
A $\delta$-neighborhood of a point $x \in D$, denoted by $N_\delta(x)$, is:
$N_\delta(x) = \{y \in D \mid H(x, y) \leq \delta\}$. Two points x and y in D are
$\delta$-neighboring if $H(x, y) \leq \delta$.
We will denote $N_1(x)$ by $N(x)$.
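As a minimal sketch of these definitions, assuming assignments are represented as tuples of 0/1 values (a representation chosen here for illustration), the Hamming distance and the delta-neighborhood can be enumerated as follows. The point x itself, which belongs to the neighborhood by definition, is left out because a search never needs to re-test it.

```python
from itertools import combinations

def hamming(x, y):
    """H(x, y): number of positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def neighborhood(x, delta=1):
    """All points y != x with H(x, y) <= delta."""
    result = []
    for d in range(1, delta + 1):
        for positions in combinations(range(len(x)), d):
            y = list(x)
            for p in positions:
                y[p] = 1 - y[p]  # flip the selected bits
            result.append(tuple(y))
    return result

print(neighborhood((1, 1, 1), delta=1))
```

For n variables the 1-neighborhood has n points, and the 2-neighborhood has n + n(n-1)/2 points.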

Given a current solution x, a $\delta$-local search algorithm either reviews
all of $N_\delta(x)$ and chooses the point which produces the best solution according
to f, or searches only until it finds an assignment that improves the value of f. We
decided to move immediately to the first neighbor that improves the objective function,
since in practice we have seen that both strategies reach similar solutions, but
this option clearly reduces the average running time of the algorithm.

The classical local search paradigm (or oblivious local search (LS), as it is
called by Khanna) uses as its objective function f the function that gives, for
each current assignment, the number of clauses satisfied by this assignment. It is
also common to use as objective function the polynomial function that results
from the arithmetization of the given formula (see for example [5], [8]).

A typical $\delta$-local search which starts from the initial solution $x_0$ looks for a
point $x_{i+1} \in N_\delta(x_i)$ so that $f(x_{i+1}) < f(x_i)$ (or $f(x_{i+1}) > f(x_i)$ if the objective
is maximized). If such a point exists, it becomes the new current solution and
the process is iterated. Otherwise, $x_i$ is retained as a local optimum.
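The iteration just described can be sketched as a first-improvement 1-local search that maximizes the number of satisfied clauses. The encoding (literals as signed integers, assignments as 0/1 sequences) and the function names are assumptions made for this illustration, not the authors' implementation.

```python
import random

def count_satisfied(formula, x):
    """Number of clauses with at least one true literal (+j is x_j, -j its negation)."""
    return sum(1 for clause in formula
               if any((x[abs(l) - 1] == 1) == (l > 0) for l in clause))

def local_search(formula, x0, objective=count_satisfied):
    """Move to the first improving 1-neighbor until none exists."""
    x = list(x0)
    improved = True
    while improved:
        improved = False
        current = objective(formula, x)
        for j in range(len(x)):
            x[j] = 1 - x[j]              # tentatively flip variable j
            if objective(formula, x) > current:
                improved = True          # keep the flip and restart the scan
                break
            x[j] = 1 - x[j]              # undo the flip
    return tuple(x)

random.seed(0)
F = [(1, 2), (-1, 2), (-2, 3), (-3, -2)]
start = tuple(random.randint(0, 1) for _ in range(3))
opt = local_search(F, start)
print(start, opt, count_satisfied(F, opt))
```

The loop terminates because the objective strictly increases with every accepted flip and is bounded by the number of clauses.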

Even though local search has been demonstrated to be an efficient strategy
for solving SAT and MaxSAT when the Boolean formula F has several satisfying
assignments, the procedure has the serious problem that it gets blocked at local
optimums. More seriously still, the approximation ratios that
this paradigm guarantees are still far from the theoretical threshold which can be
reached in polynomial time.

For example, the classical $\delta$-local search has a guarantee factor of $k/(k+1)$ [9] [11],
even though $\delta$ may take any positive integer value $\delta = o(n)$, where n is the number of
variables and k is the minimum number of literals contained in any clause of the
formula; e.g. for the Max2SAT problem, the guarantee factor provided by the
classical local search is 2/3.

3 Non-oblivious Local Search 

In the design of efficient approximation algorithms for MaxSAT, a recent
approach of interest is based on the use of non-oblivious functions, which was
introduced in [1] and [11].

The objective function in local search has been traditionally expressed as a 
function that tries to maximize the number of clauses satisfied by the current 
assignment. But different types of local search can be obtained by using different 
objective functions to direct the search, including functions that originally do not 
show the natural option of maximizing the number of clauses that are satisfied 
by an assignment. 

Given an assignment x, let $S_i$ be the set of clauses in which exactly i literals
are true, and let $w(S_i)$ be the total weight associated with the clauses in $S_i$,
i.e. $w(S_i) = |S_i|$, the cardinality of $S_i$. Khanna introduced the objective function

$f_{NOB} = \sum_{i=1}^{k} c_i w(S_i)$,

where the differences between two consecutive coefficients, $\Delta_i = c_i - c_{i-1}$,
satisfy condition (1). Note that this formulation permits us to determine the
values for each $c_i$. Khanna remarked that the value chosen for $c_0$ was not
relevant to the behaviour of non-oblivious local search. Despite this idea, we will
show that the value chosen for $c_0$ is important in order to improve the general
behaviour of the non-oblivious local search (NV-LS).

A local optimum in the classical local search is not necessarily a local
optimum in NV-LS, and in fact, the $f_{NOB}$ objective function could 'jump' over the
local optimums of the classical local search, causing different behaviour of the
search strategy.

For example, Khanna proposed for the Max2SAT problem the non-oblivious
objective function $f_{NOB} = \frac{3}{2} w(S_1) + 2 w(S_2)$. Using this objective function, it
has been shown that a 1-local search achieves a guarantee factor of 3/4. In
general, using the objective function $f_{NOB} = \sum_{i=1}^{k} c_i w(S_i)$ that obeys (1), any
1-local search for Max-kSAT ensures a guarantee factor of $1 - 1/2^k$ [11].

Therefore, the NV-LS improves the guarantee factor of the LS even if the
search is restricted to reviewing few neighbors. In fact, the guarantee factor
reached by NV-LS is the same as that which can be reached using Johnson's
greedy algorithm [10] [13].

Although Feige et al. [6] have proposed a randomized approximation
algorithm based on semidefinite programming to solve Max2SAT, with an
approximation ratio of 0.931, the generalization of this technique to MaxSAT has not
yet been resolved.

Analyzing the solutions obtained by programs that implement the classical
local search as well as the non-oblivious local search, we detected instances of
Max-kSAT in which the classical local search reaches a better solution than
NV-LS.

A simple example of this case is the family of formulas of the type:
$F = \{\{x_i, x_j\} \mid 1 \leq i < j \leq n\} \cup \{\{\neg x_1, \neg x_2\}\}$. Considering n = 5 and as the current
solution $x = (1, 1, 1, 1, 1)$, then $f_{NOB}(x) = \frac{3}{2}(0) + 2\binom{5}{2} = 20$ and, in fact, x
is a local optimum that does not satisfy F because for every 1-neighbor $x'$,
$f_{NOB}(x) > f_{NOB}(x')$; e.g. with $x_1 = (0, 1, 1, 1, 1)$, $f_{NOB}(x_1) = \frac{3}{2}(5) + 2(6) = 19.5$.
Although x is a local optimum under NV-LS, its neighbor $x_1$ is a better solution
since $x_1$ satisfies F.

Although NV-LS has a better guarantee factor than LS, the classical local
search from the same current assignment $x = (1, 1, 1, 1, 1)$ obtains a better
solution because it finds that $x_1$ is the local optimum.
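The numbers in this example can be checked with a short sketch. The encoding (signed integers for literals, 0/1 tuples for assignments) and the helper names are assumptions introduced here; the coefficient list uses c_0 = 0 because Khanna's sum starts at i = 1.

```python
from fractions import Fraction
from itertools import combinations

def true_literal_counts(formula, x):
    """For each clause, the number of its literals made true by x."""
    return [sum((x[abs(l) - 1] == 1) == (l > 0) for l in clause)
            for clause in formula]

def f_oblivious(formula, x):
    """Classical objective: number of satisfied clauses."""
    return sum(1 for t in true_literal_counts(formula, x) if t > 0)

def f_nob(formula, x, coeffs):
    """Non-oblivious objective: sum of c_i over clauses with exactly i true literals."""
    return sum(coeffs[t] for t in true_literal_counts(formula, x))

n = 5
F = list(combinations(range(1, n + 1), 2)) + [(-1, -2)]
khanna = [Fraction(0), Fraction(3, 2), Fraction(2)]   # c_0, c_1, c_2
x = (1, 1, 1, 1, 1)
x1 = (0, 1, 1, 1, 1)

print(f_nob(F, x, khanna), f_nob(F, x1, khanna))      # 20 39/2
print(f_oblivious(F, x), f_oblivious(F, x1))          # 10 11
```

Under the non-oblivious objective the move from x to x1 looks like a loss (20 against 19.5), while the classical objective sees it as a gain (10 against 11 satisfied clauses).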

In order to correct this situation, we determined a new non-oblivious
objective function for Max2SAT defined as:

$f_{NOB} = 2 w(S_2) + \frac{3}{2} w(S_1) - w(S_0)$. (2)
We have defined this objective function in order to consider the number
of unsatisfied clauses that appear in each evaluated assignment. Note that the
values of the coefficients $c_i$ with $i \geq 1$ could be used to differentiate among
assignments that have the same number of satisfied clauses, but the coefficient $c_0$
is critical in order to consider the number of unsatisfied clauses that appear in
each evaluated assignment.
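As a hedged check of the effect of the coefficient c_0 = -1 in (2), the sketch below re-evaluates the earlier example family (n = 5, all pairs {x_i, x_j} plus the clause {not x_1, not x_2}) under the new objective. The encoding and the function name are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

def f_nob(formula, x, coeffs):
    """Sum of c_i over clauses in which exactly i literals are true under x."""
    total = Fraction(0)
    for clause in formula:
        t = sum((x[abs(l) - 1] == 1) == (l > 0) for l in clause)
        total += coeffs[t]
    return total

F = list(combinations(range(1, 6), 2)) + [(-1, -2)]
new_coeffs = [Fraction(-1), Fraction(3, 2), Fraction(2)]   # c_0, c_1, c_2 as in (2)
x, x1 = (1, 1, 1, 1, 1), (0, 1, 1, 1, 1)
before, after = f_nob(F, x, new_coeffs), f_nob(F, x1, new_coeffs)
print(before, after)   # 19 39/2
```

With the penalty for the unsatisfied clause, the assignment (1, 1, 1, 1, 1) is no longer a local optimum: flipping x_1 raises the objective from 19 to 19.5.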

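Let $f_{NOB} = \sum_{i=0}^{k} c_i w(S_i)$ be the objective function for the NV-LS, with
$\Delta_i = c_i - c_{i-1}$. Without loss
of generality, we assume that the variables have been renamed such that each
unnegated literal is assigned the true value. Let $P_{i,j}$ and $N_{i,j}$ be the numbers
of clauses in $S_i$ containing the literals $x_j$ and $\neg x_j$ respectively. Consider x as a
1-local optimum and denote by $\partial f_{NOB} / \partial x_j$ the change in $f_{NOB}$ when the
value of the variable $x_j$ is flipped, i.e. when $x_j$ changes from 1 to 0. Let us
consider m to be the total number of clauses of the instance.

As x is a 1-local optimum, for $1 \leq j \leq n$:

$\frac{\partial f_{NOB}}{\partial x_j} = -\Delta_k P_{k,j} - \sum_{i=1}^{k-1} \Delta_i P_{i,j} + \sum_{i=1}^{k-1} \Delta_{i+1} N_{i,j} + \Delta_1 N_{0,j} \leq 0$,

that is,

$\sum_{i=1}^{k-1} (\Delta_{i+1} N_{i,j} - \Delta_i P_{i,j}) + \Delta_1 N_{0,j} - \Delta_k P_{k,j} \leq 0$.

Thus, $\Delta_1 N_{0,j} \leq \Delta_k P_{k,j} + \sum_{i=1}^{k-1} (\Delta_i P_{i,j} - \Delta_{i+1} N_{i,j})$.

Summing over all values of j and using the facts that $\sum_{j=1}^{n} P_{i,j} = i\, w(S_i)$ and
$\sum_{j=1}^{n} N_{i,j} = (k - i) w(S_i)$, we obtain the following inequality:

$k \Delta_k w(S_k) + \sum_{i=1}^{k-1} (i \Delta_i - (k - i) \Delta_{i+1}) w(S_i) \geq k \Delta_1 w(S_0)$. (3)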

(Note a small difference with the lower summation index obtained by Khanna [11].)
For example, for the Max2SAT case, formula (3) is written as

$2 \Delta_2 w(S_2) + (\Delta_1 - \Delta_2) w(S_1) \geq 2 \Delta_1 w(S_0)$. (4)

Thus, we want to determine values for $c_0, c_1, c_2$ in such a way that $\Delta_1$ will be a
large positive number while $\Delta_2$ and $\Delta_1 - \Delta_2$ will both be small positive numbers.
For example, if we determine that the relation $w(S_1) \leq w(S_0)$ always holds for
a class of formulas, then defining $c_0 = -1$, $c_1 = \frac{3}{2}$, $c_2 = 2$ (as in (2)), we obtain
$\Delta_1 = \frac{5}{2}$ and $\Delta_2 = \frac{1}{2}$. Then (4) is transformed into

$w(S_2) + 2 w(S_1) \geq 5 w(S_0)$. (5)

Let us denote the number of unsatisfied clauses by $w(S_0) = UNSAT$ and
the number of satisfied clauses by $w(S_1) + w(S_2) = SAT$. Then $SAT + UNSAT = m$.
From (5) we determine that $SAT + w(S_0) \geq SAT + w(S_1) \geq 5 \cdot UNSAT$,
obtaining the inequality $SAT \geq 5 \cdot UNSAT - UNSAT = 4 \cdot UNSAT$, and obtaining
a guarantee factor of 4/5 for this class of instances.
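The step from the coefficients to inequality (5) is plain arithmetic, and a small sketch can reproduce it. The function name is hypothetical; it only evaluates the three coefficients of inequality (4) from c_0, c_1, c_2, using the differences of consecutive coefficients.

```python
from fractions import Fraction

def max2sat_inequality(c0, c1, c2):
    """Coefficients (a, b, c) of a*w(S2) + b*w(S1) >= c*w(S0) in inequality (4)."""
    d1, d2 = c1 - c0, c2 - c1        # Delta_1 and Delta_2
    return 2 * d2, d1 - d2, 2 * d1

# Coefficients of the new objective function (2)
print(max2sat_inequality(Fraction(-1), Fraction(3, 2), Fraction(2)))
# Khanna's coefficients, taking c_0 = 0
print(max2sat_inequality(Fraction(0), Fraction(3, 2), Fraction(2)))
```

The first call gives (1, 2, 5), which is inequality (5). The second gives (1, 1, 3), i.e. SAT >= 3 UNSAT, which corresponds to the guarantee factor 3/4 quoted for Khanna's function.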

In order to improve the approximation factor that can be reached using
non-oblivious local search over Max2SAT instances, we must analyse the ratios
between $w(S_1)$, $w(S_2)$ and $w(S_0)$. For example, if we can find that for a given
instance $w(S_1) \leq \rho \cdot w(S_0)$
always holds, then we can find adequate values for the coefficients $c_0, c_1, c_2$, in such a
way that the coefficient of each term on the left-hand side of the inequality (4)
will be minimum (Khanna looks for unity in each term) while the coefficient
on the right-hand side will be maximum.

Of course, we do not know the value of $\rho$ until the algorithm finishes
processing the given instance. Therefore, we believe that the
coefficients $c_i$, $i = 0, \dots, k$ (at least the coefficient $c_0$), should be defined in a
dynamic way, so that they can be adjusted during the execution of the NV-LS, with
the objective of reaching the maximum approximation factor for each instance.

In order to improve the general behaviour of the algorithm, it is appropriate to
use strategies that permit us to speed up the search, 'jump' over local optimums
and determine a 'good' point for restarting the search after a local optimum is
found.

4 History-Based Heuristics 

In order to improve the general behaviour of our implemented algorithm based
on the non-oblivious local search paradigm, we added a tabu search strategy
to speed up the phase of reaching a local optimum. Moreover, we
used the complement of the last local optimum found in order to avoid reviewing
paths already explored and as a mechanism for determining a 'good' point to
restart the search process after arriving at a local optimum.

Analyzing the behaviour of the local search paradigm, we can see that the
search process can be divided into two phases. In the first phase, called
'hill ascent', a neighbor that improves the objective function can be found for
the current solution relatively quickly. This phase is relatively short [7]
and it is followed by a second phase, called 'plateau'. In this second phase it
is hard to find a neighbor that improves the objective function; therefore it is
usual to visit points that maintain the same value of the function. It is in this
second phase that most of the time in the search process is generally
spent.

One of the strategies that has traditionally been applied to improve the
behaviour of the local search paradigm is the tabu heuristic. In [5] and
[9], a tabu search has been used as an additional strategy to speed up the search
for a local optimum during the 'plateau' phase.

In both methods, the tabu strategy is based on the use of an array that
maintains the direction of the ascents found during the flips of variables. The
values of the array are used in order to forbid inverse movements, at least during
a fixed number of iterations of the algorithm. The array indicates
which variables give positive changes for the objective function in the search
process. When a variable gives a positive change in the objective function, the
difference of the change is stored in the position of this variable. Local changes
in the ascending direction are performed while the value of the position of this
variable is zero.

For example, a positive value in the position j of the array indicates a positive
local change in the variable j; therefore, in the next e iterations of the algorithm
(assuming that this positive change was e), the reverse move is forbidden in this
position.

The use of the array to store the ascending directions is useful to reduce the 
number of tested points and speed up the search for local optimums. 
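One possible reading of this tabu array is sketched below. It is an illustration under stated assumptions, not the authors' implementation: the objective is the plain count of satisfied clauses, the array `release` (a hypothetical name) stores for each variable the first iteration at which it may be flipped again, and a flip with gain e blocks its reverse move for the next e iterations.

```python
from itertools import combinations

def count_satisfied(formula, x):
    """Number of clauses with at least one true literal (+j is x_j, -j its negation)."""
    return sum(1 for clause in formula
               if any((x[abs(l) - 1] == 1) == (l > 0) for l in clause))

def tabu_local_search(formula, x0, max_iters=1000):
    x = list(x0)
    value = count_satisfied(formula, x)
    release = [0] * len(x)          # variable j is tabu while it < release[j]
    for it in range(max_iters):
        moved = False
        for j in range(len(x)):
            if it < release[j]:
                continue            # skip tabu variables: fewer points are tested
            x[j] = 1 - x[j]
            gain = count_satisfied(formula, x) - value
            if gain > 0:
                value += gain
                release[j] = it + gain + 1   # forbid the reverse move for `gain` iterations
                moved = True
                break
            x[j] = 1 - x[j]         # undo a non-improving flip
        if not moved and all(it >= r for r in release):
            break                   # nothing improves and nothing is tabu: local optimum
    return tuple(x), value

F = list(combinations(range(1, 6), 2)) + [(-1, -2)]
print(tabu_local_search(F, (1, 1, 1, 1, 1)))
```

Since only improving moves are accepted in this sketch, the tabu entries mainly save evaluations of moves that would immediately undo a recent ascent.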



4.1 Use of Complemented Values to Restart the Search 

Different history-based heuristics have been proposed to continue local search
schemes beyond local optimality. These schemes help to intensify the search in
promising regions and to diversify the search into uncharted territories by using
the information collected from the previous phase of the search [2].

Our tabu search proposal proceeds as follows. We assume that the
configuration is currently locally optimal. Only in this situation, a perturbation
or 'kick' is applied to the optimal point in order to generate a new point from
which to restart the search, in such a way that the search 'jumps' from one local
optimum to another.

The difference between our tabu search and probabilistic searches such as the
'annealing heuristic', where an 'accept/reject' test governed by a probabilistic
parameter is applied at each point, is that in our tabu search we only jump
after arriving at a local optimum.

In this plan, it is convenient to use an array where we can store the local
optimums that were found, in order to avoid falling into a loop. Tracing the
list of local optimums that were found by the algorithm, we observe that
sometimes the same local optimum was obtained again, indicating that the search goes
over regions previously explored. This effect can be avoided if we can define
appropriately the new 'restart' point of the search after it has arrived at a local
optimum.

For example, if we suppose that $x_0$ is a local optimum, then a 'kick' can be
applied to it to make the control of the search jump to a new point $x_1$ that
differs significantly from $x_0$. We use the complement of the last local optimum found
as a point to restart the search, and if this new point belongs to an already
visited path, then a new point can be built that differs from each one of the local
optimums that have been kept in the array of local optimums.
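The restart rule can be sketched as follows. The text does not specify how the alternative point is built when the complement has already been visited, so the fallback used here (flipping the bits of the complement one after another until an unvisited point appears) is purely an assumption for illustration, as is the function name.

```python
def restart_point(local_opt, visited):
    """Return a restart point for the search, or None if the fallback finds none.

    local_opt: the last local optimum (tuple of 0/1 values).
    visited:   previously stored local optimums (iterable of tuples).
    """
    forbidden = set(visited) | {tuple(local_opt)}
    candidate = [1 - b for b in local_opt]      # complement of the local optimum
    if tuple(candidate) not in forbidden:
        return tuple(candidate)
    # Assumed fallback: flip bits cumulatively until the point is new.
    for j in range(len(candidate)):
        candidate[j] = 1 - candidate[j]
        if tuple(candidate) not in forbidden:
            return tuple(candidate)
    return None

print(restart_point((1, 1, 0), []))
print(restart_point((1, 1, 0), [(0, 0, 1)]))
```

The complement is the point at maximum Hamming distance from the last local optimum, which is what gives the restart its diversifying effect.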

The use of the complement of the last local optimum for restarting the
search permits us to diversify the area of search with the intention of avoiding
reviewing paths already explored. Furthermore, the use of the local optimum array
avoids falling into loops. The resulting heuristic turns out to be a robust search
that, according to the results of the empirical analysis, improves the behaviour
of the non-oblivious local search.

Finally, the success of this algorithm is determined, first, by its ability to
move successfully through the 'plateau' phase, reducing the time of this phase,
and second, by the use of the complement of the last local optimum found as a
new point to restart the search, giving breadth to the search.

5 Computational Results

For experimental purposes, we have carried out a brief empirical analysis and
compared the results obtained by local search (LS), non-oblivious local search as
defined by Khanna (NV-LS), and our non-oblivious local search (NTA-LS), which
is combined with the two heuristics presented here (tabu strategy and use of
complements). The three algorithms were executed on small instances of formulas
in Conjunctive Form (CF) that were randomly built. Each CF is characterized
by n, m and k, where n is the number of variables, m is the number of clauses
and k is the number of literals in each clause. In our case, we consider
only instances of 2-CF, so that k is always fixed to 2. We assume that no clause
is a tautology. The method we have followed to build the random formulas is as
described in [7] and [8].
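The exact generation method of [7] and [8] is not reproduced in the text, so the following is only a sketch of one common way to build random k-CF instances: each clause draws k distinct variables uniformly and negates each with probability 1/2. Drawing distinct variables guarantees that no clause is a tautology. The function name and the encoding of literals as signed integers are assumptions.

```python
import random

def random_k_cf(n, m, k=2, rng=random):
    """Random CF with m clauses of exactly k literals over variables 1..n."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)   # k distinct variables
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        formula.append(clause)
    return formula

formula = random_k_cf(n=15, m=40, k=2, rng=random.Random(1))
print(formula[:5])
```

Passing a seeded `random.Random` instance makes the generated instance reproducible.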

We have also programmed an exact algorithm for Max2SAT in order to know
how close the various heuristics get to the true optimum. The maximum number
of trials for each algorithm is bounded by similar values, in such a way that each
algorithm had similar running times (in particular, bounded by a polynomial).

For each one of the three algorithms and for a given instance, we generated
10 random initial points and solved the instance from each of them. The ten
values obtained for each instance were averaged, giving the average number of
satisfied clauses Z for that instance. We then computed the ratio between Z
and the true optimum. This ratio is calculated for each one of the ten instances
that form a group; the mean of the ten ratios is the
approximation ratio that we plot for each fixed n and m.
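The averaging scheme just described amounts to a mean of per-instance ratios. A minimal sketch, with hypothetical function names and made-up numbers used only to show the arithmetic:

```python
from statistics import mean

def instance_ratio(values, optimum):
    """Average number of satisfied clauses Z over the runs, divided by the optimum."""
    return mean(values) / optimum

def group_ratio(runs_per_instance, optima):
    """Mean of the per-instance ratios for a group with fixed n and m."""
    return mean(instance_ratio(v, o) for v, o in zip(runs_per_instance, optima))

# Illustrative numbers only: two instances, two runs each
print(group_ratio([[8, 8], [5, 5]], [8, 10]))
```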

It should be noted that we are considering small instances that cover the area
known as the phase transition for the Max2SAT problem, i.e. $m \in [n, 5n]$, an area where
it is assumed that the instances are typically much more difficult to solve than
others away from this phase transition [4]. This region is generally considered
to be a good source of hard instances for MaxSAT (and SAT) and has been the
focus of recent experimental effort.

The experimental analysis shows that in practice, the algorithm based on
our non-oblivious local search with tabu strategy and the use of complemented
values gives the solutions closest to the true optimal value.

The graphs in figures 1, 2 and 3 show the average approximation factor
obtained by each one of the three algorithms, where n = 15, 20 and 25 and m
ranges over [25, 100], [25, 100] and [50, 125] respectively.

Fig. 1. Mean approximation ratio of each algorithm for 2-CF instances with
n=15 variables.

Fig. 2. Mean approximation ratio of each algorithm for 2-CF instances with
n=20 variables.

Fig. 3. Mean approximation ratio of each algorithm for 2-CF instances with
n=25 variables.

6 Conclusions

It is possible to improve the approximation factor obtained using non-oblivious
local search over Max-kSAT instances. If we can find, for a given instance, a
relation $w(S_i) \leq \rho \cdot w(S_0)$ for some $i = 1, \dots, k$, then by exploiting this
relation, it is possible to define appropriate values of the coefficients $c_i$, $i = 1, \dots, k$,
for the non-oblivious objective function in order to achieve the best approximation
factor for this instance.

We present different strategies that preserve the polynomial-time
complexity of the proposed algorithms and improve the general behaviour of the
non-oblivious local search. Mainly, we added a tabu search heuristic with the purpose
of speeding up the phase of reaching a local optimum, and we proposed the use of
complemented values as a mechanism for restarting the search after arriving at
a local optimum in order to avoid reviewing paths already explored.

The empirical analysis and the comparison of the results of these algorithms
show that the non-oblivious local search (that uses the new objective function
introduced here), combined with the tabu strategy and the use of the
complement of the last local optimum for restarting the search, has robust
behaviour and obtains in practice better solutions than the non-oblivious local
search alone.

Abstract. The three most well-known semantics for negation in the
logic programming framework are Clark's completion [Cla78], the
stable semantics [GL88], and the well-founded semantics [vGRS91]. Clark's
completion (COMP) was the first proposal to give a formal meaning to
negation as failure. However, it is now accepted that COMP does not
always capture the meaning of a logic program. Despite its
computational and structural advantages, the well-founded semantics (WFS) is
considered much too weak for real applications. The stable semantics
(STABLE), on the other hand, is so strong that many programs become
inconsistent. We present in this paper examples to support these claims,
and we introduce a new semantics, called CWFS, which is as powerful as
COMP in inferring positive literals and as powerful as WFS in inferring
negative literals. Due to its particular construction, CWFS helps to
understand the relationship among COMP, WFS, and STABLE. We also
discuss some implementation issues of CWFS.

Keywords: Knowledge Representation, Non-monotonic Reasoning,
Well-Founded Semantics, Stable Semantics, Clark's Completion, Normal
Programs, Logic Programming.



1 Introduction 

In the field of logic programming, a 'normal program clause' is defined as a
definite program clause with possibly negated literals (¬) in the antecedent of
the clause [Llo87]. Negation here is interpreted as negation as failure, and thus
we have a departure from classical logic to a nonmonotonic logic. The first
proposal to provide a plausible declarative semantics for negation as failure
appeared nearly 20 years ago [Cla78], and it is now referred to as Clark's
completion semantics. However, it is now accepted that Clark's completion is often
too weak and does not always capture the intended meaning of logic programs,
especially for knowledge representation tasks ([BG94,BD96b]). Hence it became
necessary to find new approaches for specifying the semantics of logic programs.
This led to the discovery of two quite dominant approaches: the well-founded
semantics (WFS) [vGRS91], and the stable semantics (STABLE) [GL88]. Despite
its computational and structural advantages, it has been observed that WFS has
the drawback that it does not infer all atoms that one would expect to be
true (see [AB94,Sch92]).

Let us consider the following example, considered in [AB94,BLM90,Dix95a],
which is representative of the problems with reasoning by cases. Let P be

    a ← ¬b
    b ← ¬a
    p ← a
    p ← b

The authors in [AB94,BLM90] argue that since neither a nor b can be derived in
any semantics based on two-valued models, the disjunction a ∨ b, and hence also p,
should be true. WFS(P) does not fulfill this, as pointed out in [AB94,Dix95a].
But observe that STABLE as well as COMP derive p. Our proposed semantics
CWFS also derives p. Consequently, many extensions of WFS have been defined
in recent years to address this situation.

STABLE, on the other hand, is inconsistent for many programs. Consider the
following program P:

    b ← a
    a ← b
    a ← ¬b

Then P does not have any stable model. One can argue by reasoning by cases
(on b) that a should be a consequence of P. Once this fact is accepted, we observe
that b is also a consequence of the program. So, the intended model of P is {a, b}.
This is what COMP(P) defines, and CWFS behaves as COMP for this program.
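For small propositional programs such as the two examples above, these claims can be checked by brute force. The sketch below enumerates the stable models (via the Gelfond-Lifschitz reduct) and the two-valued models of Clark's completion. The rule encoding (head, positive body, negative body) and the function names are assumptions made for this illustration, and the first rule of the first program is read as a ← ¬b.

```python
from itertools import combinations

def least_model(definite_rules):
    """Least model of a negation-free program given as (head, positive_body) pairs."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model

def subsets(atoms):
    ordered = sorted(atoms)
    for r in range(len(ordered) + 1):
        for combo in combinations(ordered, r):
            yield set(combo)

def stable_models(program, atoms):
    """M is stable iff M is the least model of the reduct of the program by M."""
    result = []
    for m in subsets(atoms):
        reduct = [(h, pos) for h, pos, neg in program if not (set(neg) & m)]
        if least_model(reduct) == m:
            result.append(m)
    return result

def completion_models(program, atoms):
    """Two-valued models of the completion: a <-> (body_1 or ... or body_r)."""
    result = []
    for m in subsets(atoms):
        def body_true(pos, neg):
            return set(pos) <= m and not (set(neg) & m)
        if all((a in m) == any(body_true(p, n) for h, p, n in program if h == a)
               for a in atoms):
            result.append(m)
    return result

P1 = [("a", [], ["b"]), ("b", [], ["a"]), ("p", ["a"], []), ("p", ["b"], [])]
P2 = [("b", ["a"], []), ("a", ["b"], []), ("a", [], ["b"])]

print(stable_models(P1, {"a", "b", "p"}))    # p belongs to every stable model
print(stable_models(P2, {"a", "b"}))         # no stable model
print(completion_models(P2, {"a", "b"}))     # the completion has the model {a, b}
```

The enumeration is exponential in the number of atoms, so it is only suitable for checking toy programs like these.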

We argue that COMP is very weak at inferring negative literals but not so at inferring
positive literals. On the other hand, WFS is strong enough to infer many negative
literals but very weak at inferring positive atoms. In general, STABLE derives many
literals and so it becomes inconsistent in cases where we argue that there is an
intended model.

We agree that "the diversity of different approaches in semantics of
negation suggests that there is probably not a unique intended semantics for logic
programs. Which semantics should be used depends on concrete applications.
To be able to choose the "right" semantics among different ones, it is of great
importance to understand the inherent relations between them" [Dun95].

Based on these observations, we define a new semantics called CWFS which
combines WFS and COMP in a suitable way. Roughly speaking, CWFS uses the
WFS power to derive negative literals and the COMP power to derive positive
literals. We prove that CWFS lies "in between" WFS and STABLE, namely, that
WFS $\leq_k$ CWFS $\leq_k$ STABLE (where $\leq_k$ is the knowledge ordering to be defined
in section 2). Due to the particular construction of CWFS (based on a confluent
calculus for WFS and COMP) and our results on CWFS, we are not only
suggesting a new semantics but also contributing to a better understanding of the
relationship between COMP, WFS and STABLE. The unique aspect of CWFS with respect
to the different semantics invented so far (as for instance [Dix95a,Dun95,Sch92])
is that (to our knowledge) this is the only proposal that
combines two well-known semantics to obtain a semantics that fits so well
in between WFS and STABLE. In addition, our current research suggests that
CWFS can be used to provide a direct semantics to the class of programs
studied in [JOM95,OJ96,OJ97]. A formal discussion about this claim is outside the
scope of this paper; nevertheless we give a simple example of this nature in
section 3. We restrict our attention to programs P such that ground(P) is finite
(ground is defined in section 2). The generalization to arbitrary programs is
possible but involves several technical difficulties that could hide the main idea,
and therefore we leave it out of this paper.

Our paper is structured as follows. Section 2 gives the general background
required in the paper. Section 3 gives examples of the problems of COMP,
WFS and STABLE. In section 4 we introduce CWFS and we show how CWFS
overcomes the problems mentioned in section 3. Finally, in section 5 we present
conclusions and discuss some implementation issues related to CWFS.

2 Background 

We first review the definition of classical propositional logic.

A signature is a finite set of elements that we call atoms. Classical
propositional calculus can be defined over the set of well-formed formulas defined using
a signature and the two logical connectives → and ¬, the rule of modus ponens,
and the three axiom schemas:

a1) $\alpha_1 \to (\alpha_2 \to \alpha_1)$

a2) $(\alpha_1 \to (\alpha_2 \to \alpha_3)) \to ((\alpha_1 \to \alpha_2) \to (\alpha_1 \to \alpha_3))$
a 3 ) — > ai 

It is not hard to show that classical logic needs al, a2 and a3; We use to 
denote provability in the restricted formal system with just al and a2. 

We now review general concepts related to logic programming. We assume
that the reader is familiar with standard notions such as terms, atoms, literals
and formulas. A literal is an atom a or the negation of an atom a, which we denote
by ¬a. Given a set of atoms $\{a_1, \dots, a_n\}$, we write $\neg\{a_1, \dots, a_n\}$ to denote
$\{\neg a_1, \dots, \neg a_n\}$. We may denote a normal clause C as usual [Llo87]: $a$ :- $l_1, \dots, l_n$,
where a is an atom and each $l_i$ is a literal; or by $a \leftarrow B^+, \neg B^-$, where $B^+$
contains all the positive body atoms and $B^-$ contains all the negative body
atoms. We also use body(C) to denote $B^+ \cup \neg B^-$. A program is a finite set
of clauses. Sometimes we will consider the logical constants t and f with their
intended interpretation. By ground(P) we mean the Herbrand instantiation of
the program P. An interpretation based on a signature $\mathcal{L}$ is a disjoint pair of
sets $\langle I_1, I_2 \rangle$ such that $I_1 \cup I_2 \subseteq \mathcal{L}$. Given two interpretations $I = \langle I_1, I_2 \rangle$,
$J = \langle J_1, J_2 \rangle$, we define $I \leq_k J$ iff $I_i \subseteq J_i$, i = 1, 2. Clearly $\leq_k$ is a partial order.
We may also see an interpretation $\langle I_1, I_2 \rangle$ as the set of literals $I_1 \cup \neg I_2$. When
we look at interpretations as sets of literals then $\leq_k$ corresponds to $\subseteq$.

2.1 Semantics of Normal Programs

The semantics of normal programs departs from classical logic. The logic
programming community has decided to use negation as failure (NF): if A is
a ground atom,

    the goal ¬A succeeds if A fails, and
    the goal ¬A fails if A succeeds.

This is clearly not justifiable for classical negation, at least not relative to the
given program P; the fact that A fails from P does not mean that one can prove
¬A. For example, if P is

    A ← ¬B

then B fails and so ~^B succeeds and therefore A succeeds. But A is not a log- 
ical consequence of P. For the reasons that are given for the preference of NF 
over classical negation, see [Sch92a]. Since NF is defined in terms of the oper- 
ational semantics, and as we said before, in principle, a programmer should be 
only concerned with the declarative meaning of her program, it becomes im- 
portant to provide a convincing declarative semantics to NF. The first proposal 
was given 1978 and it is called the Clark’s completion. The main idea is that, to 
deduce negative information from a normal program, we could “complete” the 
program by adding the only-if halves of the definitions of the predicate symbols. 
For example, the completion oi A ^ ~^B is A ^ ^P, and also ~^B only if there 
is no definition for B. (For the details about Clark’s completion, see [Llo87].) 
However, it is now accepted that the Clark’s completion does not always cap- 
ture the intended meaning of logic programs. Let us consider the well-known 
Unreachable program. Let P be as follows: 
edge(a, b). edge(c,d). reachable(a). 
reachable(X) ^ reachable(Y), edge(Y, X). 
unreachable (X) ^ ^reachable (X). 

Here, edge(a, b) means that there is a directed edge from a to b.

We obviously expect vertices c, d to be unreachable, and indeed, Clark’s
semantics implies it, i.e.,

$comp(P) \models unreachable(c)$ and $comp(P) \models unreachable(d)$

Suppose we add to P the clause edge(d, c) and call the resulting program P'.
Although we still expect c and d to be unreachable, the Clark’s
semantics of P' does not imply that c and d are unreachable. This example illustrates
well why COMP is too weak to infer negative literals. Moreover, by the compactness
theorem, no first-order formula can express the concept of transitive closure.
This result imposes a fundamental restriction on any semantics based only on the
notion of logical consequence.
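
The intended meaning of P' can be illustrated by computing the least set closed under
the two clauses for reachable. The following Python sketch is only an illustration: the
function name `reachable_from` and the encoding of edges as pairs are assumptions made
here, not part of the formalism above.

```python
def reachable_from(start, edges):
    """Least set containing `start` and closed under
    reachable(X) <- reachable(Y), edge(Y, X)."""
    reached = {start}
    changed = True
    while changed:
        changed = False
        for (y, x) in edges:
            if y in reached and x not in reached:
                reached.add(x)
                changed = True
    return reached


# Program P' from the text: P plus the extra clause edge(d, c).
edges = {("a", "b"), ("c", "d"), ("d", "c")}
vertices = {"a", "b", "c", "d"}

reached = reachable_from("a", edges)
unreachable = vertices - reached
print(sorted(reached))      # ['a', 'b']
print(sorted(unreachable))  # ['c', 'd']
```

The completion of P', by contrast, also admits a model in which reachable(c) and
reachable(d) are both true, since each of them supports the other through the cycle
between c and d; this is why neither unreachable(c) nor unreachable(d) follows from it.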

Hence it became necessary to find new ways of specifying the semantics of
logic programs. A new approach emerged that is now known as the canonical
model approach.

2.2 Canonical Models

This approach was initiated by the idea of stratification [ABW88]. A program
is stratified if there is no recursion through negation. With the stratified
semantics, our program example P' is broken into two parts (called strata) $P'_1$ and $P'_2$,
where $P'_1$ is:

edge(a, b). edge(c, d). edge(d, c).
reachable(a). reachable(X) ← reachable(Y), edge(Y, X).

and $P'_2$ is the single clause: unreachable(X) ← ¬reachable(X).

Then we first compute the minimal model of $P'_1$. The result will fix the
semantics of the predicates of this stratum, i.e. we will get the semantics for edge and
reachable. Then we compute the minimal model of $P'_2$ with the interpretation
of the “lower” level predicates (in this case edge and reachable) fixed. The
resulting semantics agrees with the intended meaning. Since not every program is
stratified, this idea had to be extended in a number of ways, such as local
stratification, weak stratification, modular stratification and effective stratification; see
[AB94,BG94].

The stable semantics [GL88] and well-founded semantics [vGRS91] are
general approaches to assigning semantics to a logic program that generalize the
approaches based on stratification. We will briefly describe the stable semantics
(SM) and well-founded semantics (WFS) for (possibly) infinite ground programs.

Say that a Herbrand model M of a program P is supported if for every
atom $A \in M$ there exists a clause whose consequent is A and whose antecedent
is true in M. Say that a Herbrand model M of P is well-supported if it is
supported and there exists a well-founded partial ordering < on M such that
for any atom $A \in M$ there exists a clause in P with consequent A and for
every positive literal B in the antecedent of the clause, we have B < A. For any
normal program P, the well-supported models of P are defined to be the stable
models of P. When the program has a unique stable model then it becomes the
stable declarative semantics of the program, otherwise the declarative semantics
of the program is undefined. Since many programs could be considered inconsistent
because they have several stable models, it is useful to adopt the sceptical view of
the stable semantics, which consists in defining a literal $l$ as a consequence of
STABLE if $l$ is true in every stable model of the program. Only to be congruent
with two-valued classical logic, we say that STABLE derives every literal when P
lacks stable models. (In a real problem we would perhaps prefer to say for
this case that STABLE derives no literal at all, i.e. the empty set.)
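
For small finite ground programs the stable models can be enumerated by brute force.
The sketch below is an illustration under stated assumptions: it uses the reduct-based
fixpoint characterization of [GL88] rather than the well-supported formulation given
above (the two are known to coincide for normal programs), and the clause encoding as
triples `(head, positive_body, negative_body)` and all function names are choices made
here.

```python
from itertools import combinations


def least_model(definite):
    """Least model of a ground definite program given as (head, positive_body) pairs."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model


def stable_models(program):
    """Brute-force stable models of a finite ground normal program.

    Each clause is a triple (head, positive_body, negative_body) of atom names.
    """
    atoms = sorted({h for h, _, _ in program}
                   | {b for _, pos, neg in program for b in tuple(pos) + tuple(neg)})
    found = []
    for size in range(len(atoms) + 1):
        for candidate in combinations(atoms, size):
            m = set(candidate)
            # Reduct w.r.t. m: drop every clause containing "not n" with n in m,
            # then drop the remaining negative literals.
            reduct = [(h, pos) for h, pos, neg in program
                      if not any(n in m for n in neg)]
            if least_model(reduct) == m:
                found.append(m)
    return found


# a <- not b.  b <- not a.  p <- a.  p <- b.
cases = [("a", (), ("b",)), ("b", (), ("a",)), ("p", ("a",), ()), ("p", ("b",), ())]
# p <- not p.
odd_loop = [("p", (), ("p",))]

models = stable_models(cases)
print(models)                       # two stable models: {a, p} and {b, p}
print(set.intersection(*models))    # atoms true in every stable model: {'p'}
print(stable_models(odd_loop))      # []
```

Under the sceptical view, the first program yields p (true in both stable models),
while the second has no stable model at all.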

We now discuss the well-founded semantics of a program P. First we define
a partial interpretation I as a consistent set of literals (i.e., not both $a$ and
$\neg a$ belong to I) whose atoms are in the Herbrand base of P. We say a literal is
true in I if it is in I, and we say it is false in I if its complement is in I. Let
the Herbrand base H of P and its partial interpretation I be given. We say
$A \subseteq H$ is an unfounded set (of P) with respect to I if each atom $p \in A$ satisfies
the following condition: for each clause R of P whose head is p, (at least) one
of the following holds: (i) some (positive or negative) subgoal q of the body is
false in I; (ii) some positive subgoal of the body occurs in A. Now, the greatest
unfounded set (of P) with respect to I, denoted by $U_P(I)$, is the union of all sets
that are unfounded with respect to I.

To define the well-founded semantics, three transformations $T_P$, $U_P$ and $W_P$
are defined as follows: (i) $p \in T_P(I)$ iff there is some instantiated clause R of P
such that R has head p, and each subgoal literal in the body of R is true in I;
(ii) $U_P(I)$ is the greatest unfounded set of P with respect to I; and (iii) $W_P(I) =
T_P(I) \cup \neg U_P(I)$. Let $\alpha$ range over all countable ordinals. The sets $I_\alpha$ and $I^\infty$,
whose elements are literals in the Herbrand base of a program P, are defined
recursively as follows: for a limit ordinal $\alpha$, $I_\alpha = \bigcup_{\beta < \alpha} I_\beta$. Note that $I_0 = \emptyset$. For
a successor ordinal $\alpha = \beta + 1$, $I_\alpha = W_P(I_\beta)$. Finally, define $I^\infty = \bigcup_\alpha I_\alpha$. The
well-founded semantics of a program P is the “meaning” represented by the
limit $I^\infty$.
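
For a finite ground program the iteration of $W_P$ stops after finitely many steps, so the
construction can be run directly. The following Python sketch assumes a clause encoding
as triples `(head, positive_body, negative_body)`; the function names and the atom names
in the examples (such as `reach_c`) are hypothetical and chosen only for illustration.

```python
def greatest_unfounded_set(program, atoms, true, false):
    """U_P(I) for I = (true, false): start from all atoms and remove every atom
    that has a clause with no false body literal and no positive subgoal left in the set."""
    unfounded = set(atoms)
    changed = True
    while changed:
        changed = False
        for head, pos, neg in program:
            if head not in unfounded:
                continue
            body_false = any(b in false for b in pos) or any(n in true for n in neg)
            hits_set = any(b in unfounded for b in pos)
            if not body_false and not hits_set:
                unfounded.discard(head)
                changed = True
    return unfounded


def well_founded(program):
    """Iterate W_P(I) = T_P(I) together with the negation of U_P(I), starting from the
    empty interpretation. Returns the pair (true atoms, false atoms)."""
    atoms = {h for h, _, _ in program} | {
        b for _, pos, neg in program for b in tuple(pos) + tuple(neg)}
    true, false = set(), set()
    while True:
        new_true = {h for h, pos, neg in program
                    if all(b in true for b in pos) and all(n in false for n in neg)}
        new_false = greatest_unfounded_set(program, atoms, true, false)
        if new_true == true and new_false == false:
            return true, false
        true, false = new_true, new_false


# a <- not b.  b <- not a.  p <- a.  p <- b.
cases = [("a", (), ("b",)), ("b", (), ("a",)), ("p", ("a",), ()), ("p", ("b",), ())]
# A hand-grounded fragment of a reachability program with a cycle between c and d.
graph = [("reach_a", (), ()), ("reach_b", ("reach_a",), ()),
         ("reach_c", ("reach_d",), ()), ("reach_d", ("reach_c",), ()),
         ("unreach_c", (), ("reach_c",)), ("unreach_d", (), ("reach_d",))]

print(well_founded(cases))   # (set(), set()): nothing is decided
print(well_founded(graph))   # reach_c, reach_d are false; unreach_c, unreach_d are true
```

In the second program the set {reach_c, reach_d} is unfounded with respect to the empty
interpretation, which is what allows the negative literals to be derived.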

To study in detail several issues related to semantics for normal programs as
well as current extensions of WFS, see [Dix95a,BD96b,BG94,DO97b,DO97c].
To our knowledge, our proposal is different from any other idea presented so far.



2.3 Semantics of Programs 

Following [Llo87], a program is a finite set of program statements, each of which
has the form $a \leftarrow W$, where the head $a$ is an atom and the body $W$ is an
arbitrary first order formula. We follow [Llo87] and consider them as macros of
normal programs. This approach has been followed by [HL94] and [Nai86]. In [Pet97]
this approach (with some variants in the translation) is studied with respect to the
stratified, WFS and STABLE semantics.

3 Problems of COMP, WFS and STABLE 

Example 1. We have seen in section 2.1 how COMP fails to give the intended
meaning to the Unreachable program. It was unable to derive unreachable(c)
and unreachable(d), which depend on the negative literals ¬reachable(c) and
¬reachable(d). The WFS semantics gives the
intended semantics to this program since the notion of unfounded sets is strong
enough to derive the intended negative literals. Our proposed semantics CWFS
agrees with WFS in this case.

Example 2. Our example here is taken from [vGRS91]. Let P be the program:
p ← ¬p, ¬q

Then comp(P) is inconsistent. Let us add to P the “harmless” clause $p \leftarrow p$ and
call this program $P_1$. Now $comp(P_1)$ is consistent and derives p as well as $\neg q$.
On the other hand, if we add to P the “harmless” clause $q \leftarrow q$ and call this
program $P_2$, it turns out that $comp(P_2)$ infers q as well as $\neg p$. The main criticism
of [vGRS91] is that the semantics of the three
“similar” programs P, $P_1$ and $P_2$ should not differ that much. STABLE adopts a “uniform”
position in the sense that it gives the same answer to the three programs, which in
this case corresponds to defining no intended model for any of them (i.e., the
programs lack stable models). Here, CWFS behaves as STABLE.

As already said, the main problem with WFS is that it is considered much
too weak for real applications. Let us consider again the following example used
in the introduction.

Example 3. This example is representative of the problems with reasoning by
cases. Let P be
a ← ¬b
b ← ¬a
p ← a
p ← b

As mentioned before, one can argue that p should be derivable. WFS(P) does
not fulfill this, as pointed out in [Dix95a]. When WFS cannot infer any negative
literal, its power to derive positive literals reduces to the power of the inference
system $\vdash$ (defined in section 2), which indeed is very weak. STABLE as well as
COMP derive p. So does our proposed CWFS semantics.

The following two examples illustrate problems in WFS as well as STABLE.
Example 4. Consider the program P
c ← c
b ← a
a ← b
a ← ¬b, ¬c

Then P does not have stable models and $WFS(P) = \emptyset$. The following argument
explains why $\{a, b, \neg c\}$ could be considered as the intended model of P. First,
we should be able to remove tautologies without changing the semantics of the
program, getting:
b ← a
a ← b
a ← ¬b, ¬c

Since c does not occur as the head of any clause of the program, by “negation
as failure” we can infer $\neg c$, and so we also infer $a \leftarrow \neg b$. By the pair of clauses:
a ← b
a ← ¬b

and reasoning by cases we can derive a. Finally, by modus ponens applied to a
and $b \leftarrow a$, we get b. So, the intended model of P is $\{a, b, \neg c\}$. CWFS behaves
in this way.



4 Definition of CWFS 

We remind the reader that our (normal as well as general) programs P have the
restriction that ground(P) is finite. We need the following transformation rules
(see [Dix95a,BD97,BZF97]) that we will apply to ground(P):

RED+: This transformation can be applied to P if there is an atom a which
does not occur in HEAD(P). RED+ transforms P to the program where all
occurrences of $\neg a$ are removed.

RED−: This transformation can be applied to P if there is a clause $a \leftarrow\ \in P$.
RED− transforms P to the program where all clauses that contain $\neg a$ in
their bodies are deleted.

TAUT: (Tautology) Suppose P contains a clause which has the same atom in
its head and in its body. Then we remove the given clause.

Success (S): Suppose that P includes a fact a and a clause $q \leftarrow Body$ such
that $a \in Body$. Then we replace the clause $q \leftarrow Body$ by $q \leftarrow Body \setminus \{a\}$.

Failure (F): Suppose that P includes an atom a and a clause $q \leftarrow Body$ such
that $a \notin HEAD(P)$ and a is a positive literal in Body. Then we erase the given clause.

Loop Detection (Loop): We say that $P_2$ results from $P_1$ by $Loop_A$ iff there
is a set A of atoms such that for each clause $a \leftarrow Body \in P_1$, if $a \in A$, then
$Body \cap A \neq \emptyset$, $P_2 := \{a \leftarrow Body \in P_1 : Body \cap A = \emptyset\}$, and $P_1 \neq P_2$.

It has been shown that Loop + RED+ + RED− + S + F defines a
confluent and terminating calculus over finite ground programs [BZF97]. The
normal form of a program under this calculus is called its remainder. It is not hard to see that if we
extend the system by adding the Taut reduction the calculus remains confluent
and terminating. We call this new normal form rem.
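
The reduction to the normal form rem can be sketched in a few lines of Python for
finite ground programs. This is an illustration under assumptions made here: clauses
are triples `(head, positive_body, negative_body)`, the rules are applied in batches
rather than one at a time (relying on the confluence stated above), and Loop is applied
with the largest admissible set A, taken to be the atoms that are not derivable once all
negative literals are deleted. The name `rem` follows the text; everything else is
hypothetical.

```python
def rem(program):
    """Apply TAUT, RED+, RED-, Success, Failure and Loop until nothing changes."""
    prog = {(h, frozenset(p), frozenset(n)) for h, p, n in program}
    while True:
        heads = {h for h, _, _ in prog}
        facts = {h for h, p, n in prog if not p and not n}
        # Atoms derivable when every negative literal is ignored; the rest form the set A.
        derivable = set()
        grew = True
        while grew:
            grew = False
            for h, p, _ in prog:
                if h not in derivable and p <= derivable:
                    derivable.add(h)
                    grew = True
        new = set()
        for h, p, n in prog:
            if h in p:            # TAUT: head occurs in its own body
                continue
            if n & facts:         # RED-: "not a" in the body while a is a fact
                continue
            if p - heads:         # Failure: positive atom that heads no clause
                continue
            if p - derivable:     # Loop: positive atom inside the set A
                continue
            # Success removes facts from the body; RED+ removes "not a" when a heads no clause.
            new.add((h, p - facts, n & heads))
        if new == prog:
            return prog
        prog = new


# c <- c.  b <- a.  a <- b.  a <- not b, not c.
example4 = [("c", ("c",), ()), ("b", ("a",), ()), ("a", ("b",), ()), ("a", (), ("b", "c"))]
for head, pos, neg in sorted(rem(example4), key=str):
    print(head, "<-", sorted(pos), ["not " + x for x in sorted(neg)])
```

For this program the tautology on c is removed first, after which c heads no clause and
the literal ¬c disappears from the last clause, leaving b ← a, a ← b and a ← ¬b.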

What are the minimal requirements we want to impose on a semantics?
Certainly we want facts, i.e. clauses with empty bodies, to be true. Dually, if
an atom does not occur in any head, then its negation should be true. This gives
rise to the following definition, which we can also call the explicit semantics of a
program.

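Definition 1 ($SEM_{min}$).

For any program P we define $HEAD(P) = \{a \mid a \leftarrow B^+, \neg B^- \in P\}$, the set
of all head-atoms of P. We also define

$SEM_{min}(P) := \langle P^{true}, P^{false} \rangle$

where

$P^{true} := \{p \mid p \leftarrow\ \in P\}$, $P^{false} := \{p \mid p \in \mathcal{L}_P \setminus HEAD(P)\}$

Showing $WFS(P) = SEM_{min}(remainder(ground(P)))$ is one of the main results
of [BZF97]. This result also holds if we use rem instead of remainder.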

Definition 2 (Definition of def and sup).

Let P be a ground normal program and let a be an atom. By the definition of a
in P we mean the set of clauses $\{a \leftarrow body \in P\}$, which we denote by def(a).
We define

$sup(a) := \begin{cases} f & \text{if } def(a) = \emptyset \\ body_1 \lor \ldots \lor body_n & \text{if } def(a) = \{a \leftarrow body_1, \ldots, a \leftarrow body_n\} \end{cases}$

Definition 3 (COMP(P) [Cla78]).

For any ground normal program P and a set of atoms A we define COMP(P)
over A¹ as the classical theory $\{a \leftrightarrow sup(a) : a \in A\}$.



¹ In the classical definition of COMP(P), A is the Herbrand base of P.

Definition 4 (CWFS(P)).

Let P be a normal program. We define COMP-WFS(P) :=
COMP(rem(ground(P))), where COMP is defined over the Herbrand base of P.
A Herbrand model of COMP-WFS(P) is called an intended model of P with
respect to CWFS. The basic CWFS semantics of P is defined as the unique
intended model of P with respect to CWFS. When there are many or no intended
models, we say that the program is inconsistent. The scenario CWFS semantics
of P is defined as the set of intended models of P. Finally, the sceptical CWFS
semantics of COMP-WFS(P), denoted as CWFS(P), is defined as

$\{l : COMP\text{-}WFS(P) \vdash l$, where $l$ is a ground literal of COMP-WFS(P)$\}$

Note that the sceptical CWFS is inconsistent only when it lacks intended
models. In this case it proves every ground literal. (But we could redefine this
notion and state that in this case it derives no ground literal at all.) Unless
stated otherwise, we assume the sceptical view of the CWFS semantics.



4.1 Examples. 

We explain how to find the CWFS semantics by using some examples from
section 3. To simplify the presentation we only consider propositional normal
programs here. Therefore ground(P) = P.

We start with example 2. We apply the RED+ reduction to get:

p ← ¬p

Since we cannot apply any other reduction, this is the rem(P) normal form of P.
Now, $COMP(rem(P)) = \{p \leftrightarrow \neg p, \neg q\}$. Thus, the program is inconsistent. If we
consider program $P_1$, we have to apply Taut and RED+ to obtain $rem(P_1)$.
So $rem(P_1) = rem(P)$, and we can check that $rem(P_1) = rem(P) = rem(P_2)$. So,
for CWFS the three programs are inconsistent. The same situation occurs with
STABLE. Both STABLE and CWFS avoid the irregular behavior of COMP, which
gives very different semantics to the three similar programs.

We now consider example 3. Here, no reduction can be applied and so we
only need to complete the program to get $\{a \leftrightarrow \neg b,\ b \leftrightarrow \neg a,\ p \leftrightarrow (a \lor b)\}$.

Therefore CWFS(P) = {p}.

We now consider example 4. Here, we can apply Taut and RED+ to get the
normal form rem(P):
b ← a
a ← b
a ← ¬b

And the completion of the program is $\{b \leftrightarrow a,\ a \leftrightarrow (b \lor \neg b)\}$. Therefore CWFS(P)
= $\{a, b, \neg c\}$.
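
The three calculations above can be checked mechanically by enumerating the two-valued
models of the completion. The sketch below is an illustration only: it starts from the
already reduced programs, encodes clauses as triples `(head, positive_body,
negative_body)`, and returns `None` for an inconsistent program; these conventions and
the function names are assumptions made here.

```python
from itertools import product


def completion_models(program, atoms):
    """All two-valued models of {a <-> sup(a) : a in atoms} for a ground normal program."""
    atoms = sorted(atoms)
    models = []
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))

        def sup(a):
            # Disjunction of the bodies that define a; an empty definition gives False.
            return any(all(v[b] for b in pos) and not any(v[n] for n in neg)
                       for h, pos, neg in program if h == a)

        if all(v[a] == sup(a) for a in atoms):
            models.append(v)
    return models


def sceptical(models, atoms):
    """Literals true in every model; None signals that there is no model at all."""
    if not models:
        return None
    literals = set()
    for a in atoms:
        if all(m[a] for m in models):
            literals.add(a)
        if not any(m[a] for m in models):
            literals.add("not " + a)
    return literals


rem_ex2 = [("p", (), ("p",))]                                  # p <- not p
rem_ex3 = [("a", (), ("b",)), ("b", (), ("a",)),
           ("p", ("a",), ()), ("p", ("b",), ())]               # unchanged by rem
rem_ex4 = [("b", ("a",), ()), ("a", ("b",), ()), ("a", (), ("b",))]

print(sceptical(completion_models(rem_ex2, {"p", "q"}), {"p", "q"}))            # None
print(sceptical(completion_models(rem_ex3, {"a", "b", "p"}), {"a", "b", "p"}))  # {'p'}
print(sceptical(completion_models(rem_ex4, {"a", "b", "c"}), {"a", "b", "c"}))
```

In the last case the completion is taken over the atoms a, b and c, so c, which has no
defining clause, is forced to be false and the result contains a, b and "not c".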

The following result is immediate by the construction of CWFS (due to lack
of space we omit the proofs).



Theorem 1 (closure properties of CWFS).

CWFS is closed under each transformation Loop, RED−, RED+, S, F, Taut.

Theorem 2 (STABLE models are intended models w.r.t. CWFS).

For every normal program P, if M is a stable model of P then M is an intended
model of P with respect to CWFS.

The following theorems assume the sceptical approach of the given semantics.

Theorem 3 (CWFS is consistent in more cases than STABLE).

For every normal program P, STABLE(P) is consistent implies that CWFS(P)
is consistent, but the converse is false.

Theorem 4 (CWFS is in between WFS and STABLE).

For every normal program P, $WFS(P) \le_k CWFS(P) \le_k STABLE(P)$. Moreover,
the inequalities are strict.

For (general) programs, we define CWFS(P) := CWFS(trans-to-normal(P)),
where trans-to-normal is the translation given in chapter 4 of [Llo87].

5 Conclusions and Further Work 

Several authors have pointed out some shortcomings of COMP, STABLE and
WFS. We review some program examples and present new ones that support
this claim. COMP lacks a good mechanism to derive negative literals, while
WFS tends to be too weak and STABLE too strong. We propose the new
semantics CWFS, which tries to reduce the problem by adopting a stronger form
than WFS while staying weaker than STABLE. Our research suggests that CWFS
can be used to model aggregation, where WFS is too weak but STABLE is
too strong; see [OJ97]. This is, however, material for a future paper. Moreover,
CWFS is closed under well-known transformations such as Taut, S, F, etc.
Thanks to confluence and termination, these transformation rules have both a
declarative and an operational meaning. From the declarative point of view,
they tell us that our semantics is closed under the given transformation rule.
From an operational point of view, the transformations are computable functions
that can be applied to simplify the program. With regard to implementation
issues, it may be noted that computing ground(P) is a very costly operation.
In [BZF97] it is shown that to compute remainder(ground(P)) we can start with
a subset of ground(P) that we can compute more efficiently. Their result also
applies to rem. The same authors explain in [BZF97a] that “in any practical
implementation, one would not apply the transformation to the entire set of
clauses but partition the program clauses according to the strongly connected
components of its static dependency graph”. They also present a strategy of
transformation applications that provides an efficient way to carry out the entire
reduction. The algorithm is polynomial time w.r.t. the size of the EDB. For several
programs rem(ground(P)) is already a set of facts, and so we can immediately
compute CWFS, skipping the completion part. Then, as shown in [BD95], we
can compute CWFS by doing hyperresolution on rem(ground(P)). Another idea
could be to transform rem(ground(P)) into a set of constraints and to use linear
programming as explained in [BNNS93] to compute the minimal models of the
completion of the program.



Abstract. This paper focuses on the object recognition problem in
computer vision under partial occlusion. The approach followed is
the alignment method, described extensively in the
literature. In this approach the recognition process is divided into two
stages: in the first stage, the transformation in space between the viewed
object and the model object is determined. In the second stage the model
that best matches the viewed object is found. Given four points in the
image, it is necessary to find the four corresponding points in the model.
This problem, which involves combinatorial search, is solved by means of a
genetic algorithm. The occlusion problem has been given special
attention, and a new method is proposed consisting of three
processes: identification, grouping and verification. The recognition
algorithm proposed here has been tested on several examples, obtaining
good results.



1 Introduction 

Object recognition is one of the most important aspects of visual perception. The
problem of shape-based object recognition can be approached at different levels of
complexity, depending on the restrictions placed upon the scene configuration:

• The objects considered are flat and rigid, moving in the plane and scaling. 

• Flat objects which are not restricted to move in the plane, but are allowed to move 
and rotate in three-dimensional space. 

• Three-dimensional objects (rather than flat) in rigid transformations. This case can 
be further subdivided according to whether the visible contours are “sharp”, such 
as the edges of a cube, or smooth, such as the projected silhouette of a cylinder or 
a sphere. 

• Articulated objects, that is, objects containing movable parts, such as a pair of 
scissors or the human body. 

• Many real objects can undergo more complicated transformations, such as 
bending, stretching, and other types of distortions. 

This paper examines the case of flat objects in three-dimensional space with
possible occlusions. A large number of different methods have been proposed in the
literature, but the different approaches can be classified into three main classes:
recognition by invariant properties, recognition by object decomposition into parts,
and alignment methods [1].

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 242-252, 1998.

The first approach is not appropriate in the case of partially occluded objects.
Invariant properties are in most cases related to the global analysis of the object to
be recognized. Obviously, in the presence of occlusions it is impossible to compute
these properties.

Recognition by object decomposition into parts is related to a theory presented by 
Biederman [2], named Recognition by Components (RBC). The RBC is a theory of 
human image understanding from the field of psychology. Following this theory, the 
perceptual recognition of objects is considered to be a process in which the input 
image is segmented at regions of deep concavity into an arrangement of simple 
geometric components, such as blocks, cylinders, wedges, and cones. The 
fundamental assumption of the RBC theory is that a modest set of generalized-cone
components, called geons, can be derived from contrasts of five readily detectable
properties of edges in a two-dimensional image: curvature, collinearity, symmetry,
parallelism, and cotermination. The usefulness of this theory lies in the assumption
that the detection of these properties is generally invariant over viewing position, and
consequently, robust object perception is carried out when the image is projected from
a new point of view. In the case presented here, where we are considering only flat 
objects, the image segmentation takes place in the T-junction points formed by 
contour occlusions. 

Some authors propose contour segmentation at negative curvature minima
prior to the recognition process [3], and others segment the curves into
concave-convex segments [4]. These approaches have not been taken into account since we
are considering a general contour without any restrictions, which may present no
negative curvature minima or inflexion points. Moreover, curvature maxima are not
in general invariant under projective transformations, although in most cases they are
preserved.

Alignment methods are another approach to visual object recognition [5]. In this
approach the recognition process is divided into two stages. The first one determines
the transformation in space that is necessary to bring the viewed object into alignment
with possible object models. The second stage determines the model that best
matches the viewed object.

The approach presented here is based on the segmentation of the image into parts by
locating the T-junction points. The new segments so generated will be
identified following an alignment method that takes projective
transformations into account.

In the following section we discuss this approach in more detail, reviewing the
main approaches implemented to date. In section 3 the shape matching algorithm
we propose, based on Genetic Algorithms, is presented, and in section 4
the occlusion problem is taken into account. Section 5 shows two experiments
of object recognition in the presence of occlusion based on the proposed method, together
with a discussion of the results. Finally, in section 6 we present our conclusions.

2 The Alignment Approach

If V denotes the object view which is under analysis for recognition, $M_i$ is an object
model in the database, and $T_{ij}$ is the set of allowed transformations that can be applied
to object model $M_i$, then the process of recognition requires finding the best
transformation $T_{ij}$ that, applied to the model $M_i$, maximizes a given function F of fit
quality between the model and the object.

The alignment approach can be decomposed into two stages. In the first stage, the
transformation between the object view and the model object, for all candidate
models, is determined. Afterwards, the object model that best matches the object view
is selected. The first step is known as the alignment stage. The transformations allowed
are projective transformations.

Since the projective plane has three homogeneous coordinates, the transformation
is represented by a 3x3 matrix with 8 essential parameters. The general projective
transformation from one projective plane, $\Pi$, to another, $\pi$, is represented as:

The coordinates before the transformation are represented in capital letters, while the
coordinates after the transformation are represented in small letters. Cartesian
coordinates are obtained from the previous expression according to the equations:

In the previous equations, multiplying every entry of T by a common nonzero factor
changes neither the x coordinate nor the y coordinate; $t_{33}$ can therefore be treated as an
arbitrary scale factor that does not affect the
Cartesian coordinates, so we can choose $t_{33} = 1$.
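
For reference, the relations described in this section can be written out in a standard
form. The following is a sketch under assumed conventions (column vectors, entries
$t_{ij}$ indexed by row and column, homogeneous coordinates $(X_1, X_2, X_3)$ before and
$(x_1, x_2, x_3)$ after the transformation, with $X = X_1/X_3$ and $Y = X_2/X_3$); the
index convention of the original displays may differ.

```latex
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix}
t_{11} & t_{12} & t_{13} \\
t_{21} & t_{22} & t_{23} \\
t_{31} & t_{32} & t_{33}
\end{pmatrix}
\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix},
\qquad
x = \frac{x_1}{x_3} = \frac{t_{11} X + t_{12} Y + t_{13}}{t_{31} X + t_{32} Y + t_{33}},
\qquad
y = \frac{x_2}{x_3} = \frac{t_{21} X + t_{22} Y + t_{23}}{t_{31} X + t_{32} Y + t_{33}}

% With t_{33} = 1, each correspondence (X, Y) -> (x, y) gives two linear equations
% in the eight remaining unknowns:
t_{11} X + t_{12} Y + t_{13} - x X\, t_{31} - x Y\, t_{32} = x,
\qquad
t_{21} X + t_{22} Y + t_{23} - y X\, t_{31} - y Y\, t_{32} = y
```

Multiplying the two Cartesian equations through by the common denominator and setting
$t_{33} = 1$ gives the pair of linear equations, so four correspondences yield eight
equations in eight unknowns.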

The projective transformation matrix T requires eight independent parameters to 
define a unique mapping. Since each point in the plane provides two Cartesian 
coordinate equations, it is necessary to find four point correspondences, provided that 
no three of them are collinear, between two projectively transformed planes to define 
the transformation matrix uniquely. 
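
The four-point construction can be sketched in code. The Python fragment below is an
illustration under assumptions made here: it uses the column-vector convention with
$t_{33} = 1$, solves the 8x8 system by plain Gauss-Jordan elimination, and all function
names and the example coordinates are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting (a is n x n)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: three of the points may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * w for u, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def projective_from_points(model_pts, image_pts):
    """3x3 matrix T (with t33 = 1) mapping four model points (X, Y) to image points (x, y)."""
    rows, rhs = [], []
    for (X, Y), (x, y) in zip(model_pts, image_pts):
        rows.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y])
        rhs.append(x)
        rows.append([0, 0, 0, X, Y, 1, -y * X, -y * Y])
        rhs.append(y)
    t = solve_linear(rows, rhs)
    return [t[0:3], t[3:6], t[6:8] + [1.0]]


def apply_projective(T, point):
    """Map a Cartesian point through T and divide by the third homogeneous coordinate."""
    X, Y = point
    w = T[2][0] * X + T[2][1] * Y + T[2][2]
    return ((T[0][0] * X + T[0][1] * Y + T[0][2]) / w,
            (T[1][0] * X + T[1][1] * Y + T[1][2]) / w)


model = [(0, 0), (1, 0), (1, 1), (0, 1)]      # unit square in the model plane
image = [(0, 0), (2, 0), (3, 2), (0, 1)]      # its assumed projection in the image
T = projective_from_points(model, image)
for p, q in zip(model, image):
    print(p, "->", apply_projective(T, p), "expected", q)
```

Each correspondence contributes two rows to the system, which is why four points with
no three collinear are needed to fix the eight unknowns.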

The resulting linear system of equations is:

Ayache and Faugeras [6] present a method to identify and locate objects lying on
a flat surface. This approach, known as HYPER, analyzes real scenes with randomly
oriented and partially occluded flat industrial parts. The model position is defined by a
transformation T, which takes into account a rotation in the plane, a scaling and a
translation. The description of the model and the scene primitives is carried out by
approximating the contour by polygons, more exactly, by a set of linear segments,
where every segment is described by four parameters, x, y, l and a, where x and y are
the coordinates of the segment midpoint, l is the segment length and a is the segment
orientation measured relative to the horizontal axis. Hypothesis generation is
based on local compatibility, defined in terms of the differences in angle and length between
segments once the transformation T has been estimated. The evaluation process takes place by
updating the model position following a recursive least-squares technique (Kalman
filter) in order to update the estimated transformation T.

The main difference in relation to our proposal lies in the paradigm employed to
set up the hypothesis generation and evaluation. Ayache and Faugeras use a linear
approach due to the restrictions placed upon the scene configuration, i.e., rotation in the
plane of the image. In our case, it is not possible to follow a linear method such as the
previous one, because of the non-linear character of our problem: rotation of the
object in 3D space. So, we employ a paradigm based on genetic algorithms to solve
the problem of searching for the optimum solution in non-linear processes.

Zisserman et al. [7] obtain four distinguished points based on properties that are 
preserved under projection, such as incidence properties (like tangency and points of 
tangency). This approach is not useful in the case of occlusion because the points 
obtained from the input image following the procedure described by Zisserman, are 
not equivalent to those obtained by the same method in the model view. 

Huttenlocher and Ullman [8] implement an application based on the alignment
approach in order to recognize flat objects, like rigid machine parts that are allowed
to translate, rotate in space and change scale. The recognition system identifies a
small number of salient and stable points such as strong maxima in curvature, deep
concavities and the centres of closed or almost closed blobs. The method proposed by
Huttenlocher and Ullman uses three points to determine the transformation,
which is specified by six parameters: three for rotation, two for the
translation (under orthographic projection) and one for scaling. In this paper, we are
considering four points because the transformations allowed are not restricted to be
orthographic, but can be projective.

In the case of projective transformations, it is difficult to identify stable points. For
example, the maxima in curvature used by Huttenlocher and Ullman are not in
general stable. For instance, a circle can be viewed as an ellipse with two
maxima of curvature or, vice versa, an ellipse can be transformed into a circle, losing
these extreme points. On the other hand, the most important difference lies in the
possibility of occlusion by other objects.

If we consider visual recognition as a problem involving search in a large space,
i.e., given a viewed object, the best match is sought in the space of all stored object
models and all of their possible views, a method that solves combinatorial
optimization problems must be used in order to obtain a solution in a reasonable time. In this
work we propose the use of Genetic Algorithms as the paradigm to follow.



3 Genetic Algorithms for Shape Matching 

First proposed by John Holland in 1975, genetic algorithms are computational models
based on the mechanics of natural selection and natural genetics [9]. They provide a
powerful paradigm for solving combinatorial optimization problems. As opposed
to other optimization techniques which work with a single point in the search space,
the genetic algorithm maintains a large population of configurations and combs the
search space from a multitude of points.

In the present decade, genetic algorithms have seen widespread use as
optimization techniques to solve problems in a wide variety of domains, including
structural shape matching [10]. In this paper we are concerned with a genetic
algorithm (GA) for the problem of finding the best matching between a candidate
shape and its corresponding model in a database.

Basically, the so-called simple genetic algorithm has been implemented, although a
scaling mechanism has been added in order to avoid premature convergence,
and dynamic techniques are used [11]. The process consists of the selection of four
non-collinear points in the input image and the localization of these points in the
different object models.

The structure of the GA implemented is as follows:

program Genetic(MAX)
  initialization;
  evaluation;
  scaling;
  repeat
    generation;
    evaluation;
    scaling;
  until number of generations = MAX;
end.



The initialization function creates and initializes a population.

The evaluation function computes the fitness of the individuals in a population. It
consists of the following steps:

• Decoding the binary strings to obtain the corresponding values in pixels.

• Calculation of the transformation matrix following the process described in the
previous section.

• Application of the transformation matrix to the model.

• Arrangement of the pixels in the input image under consideration and the model in
order to compare both.

• Normalization process in order to compensate for the transformations prior to
comparing the viewed object with potential models.

• Fitness function evaluation.

A simple matching measure similar to the Hamming distance used by other authors
like Ullman has not been followed because of the inherent inaccuracy of the process.
Instead of the Hamming distance we have used a fitness function that takes
into account the correlation of the x and y coordinates between the observation
and the model, as well as the mean of the Euclidean distance. This fitness function
has the following expression:

fit = 0.2 · corrcoef(x) · 0.2 · corrcoef(y) + 0.6 · mean    (4) 



where corrcoef(x) is the matrix of correlation coefficients formed from array x. 

Scaling of fitness values involves readjusting the string fitness values in order to 
avoid premature convergence. A linear scaling has been implemented to ensure that 
the maximum number of offspring allocated to a string is 2. 
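
The linear scaling step can be illustrated with a minimal Python sketch. It assumes the common scheme f' = a·f + b, in which the average fitness is preserved and the best string is mapped to c_mult = 2 times the average, so that under proportional selection it is expected to receive at most 2 offspring. The function name, the fallback branch for populations whose worst value would become negative, and the sample values are illustrative assumptions, not the paper's MATLAB code.

```python
def linear_scale(fitness, c_mult=2.0):
    """Return linearly scaled fitness values f' = a*f + b.

    The mean is preserved and, when this does not make the worst
    individual negative, max(f') == c_mult * mean(f).
    Raw fitness values are assumed to be non-negative.
    """
    n = len(fitness)
    f_avg = sum(fitness) / n
    f_max = max(fitness)
    f_min = min(fitness)
    if f_max == f_avg:
        # All individuals are equal: nothing to scale.
        return list(fitness)
    if f_min > (c_mult * f_avg - f_max) / (c_mult - 1.0):
        # Normal case: stretch so that the best gets c_mult * average.
        delta = f_max - f_avg
        a = (c_mult - 1.0) * f_avg / delta
        b = f_avg * (f_max - c_mult * f_avg) / delta
    else:
        # Fallback: stretch as far as possible, mapping the worst to 0.
        delta = f_avg - f_min
        a = f_avg / delta
        b = -f_min * f_avg / delta
    return [a * f + b for f in fitness]


scaled = linear_scale([4.0, 5.0, 6.0, 9.0])
```

With the sample values above the average is 6 and the maximum is 9, so the scaled maximum becomes twice the average while the average itself is unchanged.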

The generation function creates a new offspring population according to: 

• Selection of the parents. 

• Perform crossover and mutation. 

• Generation of a new population. 

• Encoding the solutions as binary strings. 

The previous algorithm has been implemented in MATLAB and run on a PC 
based on a Pentium processor at 150 MHz. The computational cost C of the 
algorithm (expressed in seconds) is given by equation 5, where the linear dependence 
on all variables is apparent. 



C = 10 • (6.45 -Z? - m + 2.5-m + 119-m-x + ^- [7.4 bm + 2.5m + \\9m-j 



( 5 ) 



where the parameters f, n, b, m and x are defined as: 
f = number of objects, 
n = number of generations, 
b = number of bits in the codification, 
m = number of individuals in the population, 
x = number of pixels of the contour to approximate. 

The genetic parameters, such as crossover probability, mutation probability, population 
size and number of generations, have been chosen experimentally. The population has 
a size of 100 individuals, i.e., strings of bits; the crossover probability is equal to 0.6 
and the mutation probability is 0.003. Only 20 generations have been found 
sufficient to reach a good solution. 
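
To make the role of these parameters concrete, the following Python sketch runs a simple GA with the values quoted above (population 100, crossover probability 0.6, mutation probability 0.003, 20 generations). It is only an illustration under stated assumptions: roulette-wheel selection and one-point crossover are the usual choices for the simple GA but are not spelled out in the text, fitness scaling is omitted for brevity, and the toy fitness (number of ones plus one) stands in for the image-based fitness, which needs the contour data.

```python
import random


def select(population, fitness, rng):
    """Roulette-wheel selection: probability proportional to fitness."""
    return rng.choices(population, weights=fitness, k=1)[0]


def crossover(a, b, p_cross, rng):
    """One-point crossover applied with probability p_cross."""
    if rng.random() < p_cross and len(a) > 1:
        cut = rng.randrange(1, len(a))
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a[:], b[:]


def mutate(bits, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return [bit ^ 1 if rng.random() < p_mut else bit for bit in bits]


def run_ga(fitness_fn, n_bits, pop_size=100, p_cross=0.6,
           p_mut=0.003, max_gen=20, seed=0):
    """Run the generation loop and return the best final individual."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(max_gen):
        fitness = [fitness_fn(ind) for ind in population]
        offspring = []
        while len(offspring) < pop_size:
            p1 = select(population, fitness, rng)
            p2 = select(population, fitness, rng)
            c1, c2 = crossover(p1, p2, p_cross, rng)
            offspring.append(mutate(c1, p_mut, rng))
            offspring.append(mutate(c2, p_mut, rng))
        population = offspring[:pop_size]
    return max(population, key=fitness_fn)


# Hypothetical toy fitness, strictly positive so that the weights are valid.
best = run_ga(lambda bits: 1 + sum(bits), n_bits=16)
```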



4 The Problem of Occlusion 

In this section we consider the problem of partial descriptions of the data, that is, 
the problem of matching partial descriptions to complete descriptions of the 
models. By partial data we refer to curve segments which are bounded by a pair of 
T-junction points. The method proposed consists of three processes: identification, 
grouping and verification. During the identification process, isolated curve segments 
extracted from the input image are compared and associated with features of the 
object models. The final result is a list of relations between every curve segment in 
the input image and the associated models which have been identified as possible 
solutions. Figure 1 shows a list of segments corresponding to occluded contours 
and pointers to the models they may belong to, which we have called the identification tree. 




Fig. 1. Identification tree (segments with pointers to candidate models). 



Once all segments have been classified, a grouping process is carried out. In this 
process, open isolated segments that may belong to the same model object are grouped 
into a new segment hypothesis. Finally, the groupings formed in this 
way are verified. This verification consists of the simultaneous recognition of both 
segments by means of the estimation of the transformation matrix for the grouping, 
and the evaluation of the fitness function. 



5 Results and Discussion 

Figure 2 shows the model database used to test the proposed algorithms. 



Fig. 2. Model data base (twelve models, numbered 1 to 12). 



As mentioned previously, the alignment method considers four points in the 
occluded view of the object, two of them being the T-junction points formed due to 
the occlusion of the contour, and determines the transformation matrix for every model. 
The matching process takes place only in the piece of the model contour between the 
points equivalent to the occlusion points in the input image. 

As an example, figure 3A shows, in fine line, the view of a partially occluded 
object. Once the proposed method has been applied, it has been found that the 
view displayed in figure 3B represents the best solution of all those present in the model 
data base. Figure 3C shows the piece of the model contour that 
corresponds to the view under consideration. Finally, figure 3A shows, in dashed 
line, the best transformation given by the GA, which matches the view very closely. 





Fig. 3. Example of occlusion. 



In the previous example, the recognition process took place without any problem, 
since the piece of the original contour exhibited sufficient significant information 
to determine the correct model with no ambiguity. However, in more 
realistic examples, we can find a lack of precision in the recognition process, since it is 
very common to find several acceptable models corresponding to the input image, 
each with a different transformation matrix. This is due to the great flexibility exhibited 
by the projective transformation to change the shape of the objects. Figure 4 is an 
example of this. On the left side, a view drawn in fine line has been obtained from the 
model, drawn in thick line, just by applying the following transformation: 

x = 2X + 0.2Y + 600 
y = 2Y - 0.5X + 400 
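
As a small illustration, the transformation above can be applied to a list of model contour points to obtain the view coordinates. This Python sketch assumes that (X, Y) are model coordinates and (x, y) view coordinates; the function name and the sample points are hypothetical.

```python
def transform(points):
    """Map model points (X, Y) to view points (x, y)."""
    return [(2.0 * X + 0.2 * Y + 600.0, 2.0 * Y - 0.5 * X + 400.0)
            for X, Y in points]


# Two hypothetical model points.
view = transform([(0.0, 0.0), (100.0, 50.0)])
```

The constant terms (600, 400) translate the contour, while the remaining coefficients scale and shear it, which is consistent with the axis range of roughly -500 to 2000 used in figure 4.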

If we consider now this view occluded by another object, as represented on the 
right side of the same image, it results in two isolated contour pieces labelled A and B. 



Fig. 4. An occlusion generating two isolated pieces. 



As described in the previous section, the procedure begins by identifying the 
models in the data base to which each non-occluded piece of the contour 
can belong. In the present example, it has been found that piece A can correspond 
to model objects 2, 3, 4, 8, 10, 11 and 12. On the other hand, piece B can 
correspond to model objects 1, 3, 4, 9, 10, 11 and 12. There are five common 
solutions that have to be verified independently. Four of them, those corresponding to 
objects 3, 4, 10 and 12, are quickly rejected since there are pixels in one of the 
pieces, A or B, that coincide with pixels in the other piece, B or A. So, there is 
only one solution left that has to be verified following the described process, i.e., 
given the four occlusion points, the transformation matrix is calculated and the fitness 
function is computed for this solution. 

The final result is depicted in figure 5. In 5A both pieces of the contour are 
shown, as well as the same pieces, in dashed line, computed according to the 
transformation matrix found. In 5B the respective pieces are drawn in thick line over 
the model identified. 




Fig. 5. Recognition of isolated pieces of the contour partially occluded. 



The process of verification, carried out here with only two isolated pieces of contour, can 
be directly generalized to an arbitrary number of pieces. 
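
The grouping step for the example with pieces A and B can be sketched in Python as a set intersection over the identification tree, which extends unchanged to any number of pieces. The dictionary layout and function name are illustrative assumptions; the model numbers and the set of rejected models are those quoted in the text.

```python
# Identification tree: contour piece -> models it may belong to.
identification_tree = {
    "A": {2, 3, 4, 8, 10, 11, 12},
    "B": {1, 3, 4, 9, 10, 11, 12},
}


def common_models(tree):
    """Models compatible with every piece in the tree."""
    return sorted(set.intersection(*tree.values()))


candidates = common_models(identification_tree)

# Models rejected in the text because pixels of one piece
# coincide with pixels of the other piece.
rejected = {3, 4, 10, 12}
remaining = [m for m in candidates if m not in rejected]
```

Only the models left in `remaining` need the full verification (transformation matrix estimation and fitness evaluation).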



6 Conclusions 

This paper is focused on the object recognition problem, more precisely, on the 
recognition of partially occluded flat objects. This issue is very important in industrial 
applications where we can find flat objects in 3-D space under possible occlusions. 
The method developed in this work is based on two different approaches that together 
provide a powerful framework to solve this problem. These approaches are 
recognition by object decomposition into parts, and alignment methods. 

The alignment approach has been carried out taking four non-collinear points in the 
input image and looking for the corresponding points in every model in the image 
database. Two of the alignment points correspond to the T-junction points between 
the contour of the object under consideration and the contour of the object or objects 
occluding the previous one. 

Due to the great flexibility exhibited by the projective transformation, we have 
found a problem in the frequent lack of uniqueness in the identification of an occluded 
contour. This leads to multiple identifications of models. This problem has 
been solved in part thanks to the simultaneous identification of several occluded 
contours by means of the so-called identification tree proposed in this work. 

The results provided by the present algorithm applied to a limited image database 
have been completely satisfactory, so we are encouraged to follow this line by testing 
the proposed approach in the recognition of flat objects in real images. 

The main problem in this approach is the computational time necessary to get a 
solution, due to the kind of transformations allowed and the lack of restrictions 
imposed on the solution of the problem. Nevertheless, we have verified that the cost of 
the considered approach is linear in the number of models present in the 
model data base. For a realistic implementation we are considering 
implementing the algorithms on a massively parallel architecture that takes into 
account the inherent parallelism exhibited by the GA. 



Abstract. Photometric stereo (PS) images are obtained from a single 
camera under different illuminations, and have traditionally been em- 
ployed for the estimation of surface gradient in computer vision. Re- 
cently, it has been suggested that such images can also be matched to 
yield relevant depth cues both in human and in artificial vision. Here, we 
analyse the optical flow information carried by pairs of PS images taken 
under slightly different illuminations: By modelling the displacement of 
the irradiance pattern over the imaged surfaces, due to the illumination 
change, as a rotation plus a small translational correcting factor, we are 
able to obtain good 3D surface reconstructions through a structure-from- 
motion least-squares approach. Our framework for shape estimation does 
not require knowledge of the reflectance map function, which is at the 
core of the traditional approach to photometric stereo. 



1 Introduction 

Photometric stereo (PS) images are obtained from a single camera under dif- 
ferent illuminations, and have traditionally been employed for the estimation of 
surface gradient in computer vision [1,2]. Recently, it has been suggested that 
such images can also be matched to yield relevant depth cues both in human 
and in artificial vision. In [3], for instance, a relation was found between the 
disparity field obtained by matching PS images along a fixed direction, and the 
shape of the imaged surface. Here we extend that approach, by considering the 
shape information carried by the optical flow associated with a pair of PS images 
captured under slightly different illuminations. Such optical flow, which results 
from the displacement of the irradiance pattern over the imaged surface, due to 
the change in illumination, can be related to the surface function if a plausible 
model is assumed for the underlying movement. As we show in the following 
section, essentially the same relation as in [3] can be obtained when we model 
such movement as a rotation plus a small translational component along the 
optical-axis direction. Our approach has the advantage of being completely in- 
dependent of the reflectance map function, which is the basis for the traditional 
PS estimation and also for the work in [3]. We present reconstructions yielded 
by our optical-flow approach to photometric stereo (OFPS), both for synthetic 
and for real images of surfaces with Lambertian plus quasi-specular reflectance. 

2 Photometric Stereo and Optical Flow 

Let us consider a pair of photometric stereo images, $I_1$ and $I_2$. It is known that 
the image intensities can be expressed as [4] 

$$ I_i(x, y) = R_i(p, q), \quad i = 1, 2 \qquad (1) $$ 

where $p = \partial z/\partial x$ and $q = \partial z/\partial y$ are the gradient components of the imaged surface, 
$z(x, y)$, and where the reflectance map functions $R_1$ and $R_2$ correspond to the 
two illumination directions, $S_1$ and $S_2$. If those directions are not far apart, and 
if the surface is smooth, we can attempt to match the two images to obtain a 
disparity map, $D(s) = (D_x(s), D_y(s))$, such that $I_1(s) \approx I_2(s + D(s))$ at each 
image point, $s = (x, y)$. Employing a Taylor-series expansion on the right-hand 
side of such equation, we find that the PS disparity map satisfies the relation 

$$ \Delta I(s) = D_x \frac{\partial I_2}{\partial x} + D_y \frac{\partial I_2}{\partial y} \qquad (2) $$ 

which is the standard optical-flow constraint equation, with $\Delta I(s) = I_1(s) - I_2(s)$ 
playing the role of the time derivative of the image intensity, and with the 
disparity vector playing the role of the flow velocity [5]. 

The PS disparity field (or optical flow field), $D(s)$, results from the displacement 
of the irradiance pattern over the imaged surface, and carries information 
about its shape. Such information can be recovered via a structure-from-motion 
scheme, if some assumption is made about the underlying movement. We propose 
to model such movement essentially as a rotation, allowing for a correction 
factor in the form of an arbitrary translation along the z-axis (optical axis direction). 
As is known, under orthographic projection such a translation does 
not affect the optical flow (see Section 4). 

Calling $\Omega = (A, B, C)$ the rotation vector, and $V(x, y)$ the translation along 
the z-direction, the equation of motion for the irradiance pattern, given in terms 
of a coordinate system fixed with respect to the camera, becomes 

$$ \Delta \mathbf{R} = \Omega \times \mathbf{R} + V(x, y)\,\hat{\mathbf{z}} \qquad (3) $$ 

where $\Delta \mathbf{R}$ is the displacement of an infinitesimal irradiance patch initially located 
at point $\mathbf{R} = (x, y, z)$ in the scene. From (3), we obtain the equations 

$$ \Delta x = D_x = Bz - Cy, \quad \Delta y = D_y = Cx - Az, \quad \Delta z = V(x, y) + Ay - Bx \qquad (4) $$ 

where the first two identities follow from the assumption of orthographic projec- 
tion. 
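
As a quick numerical check of the motion model, the displacement of a scene point under a rotation (A, B, C) plus a translation V along z can be evaluated directly from the cross product. This is a minimal Python sketch; the function name and the sample values are illustrative assumptions.

```python
def displacement(omega, r, v):
    """Displacement of point r = (x, y, z) under rotation vector
    omega = (A, B, C) plus a translation v along the z axis."""
    A, B, C = omega
    x, y, z = r
    dx = B * z - C * y          # observed flow component D_x
    dy = C * x - A * z          # observed flow component D_y
    dz = A * y - B * x + v      # not observable under orthographic projection
    return dx, dy, dz


# Hypothetical rotation, scene point and translation.
dx, dy, dz = displacement((0.1, 0.2, 0.3), (1.0, 2.0, 4.0), 0.5)
```

Only `dx` and `dy` are visible in the image, which is why the translation v can be changed freely without altering the optical flow.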

The first two equations in (4) relate the observed optical flow to the surface 
function $z(x, y)$ and to the rotation components A, B and C. The third relation, 
on the other hand, involves the unobservable translational component, $V(x, y)$, 
and takes part in a further constraining relation: since we are assuming that 
the irradiance pattern moves, due to the illumination change, as if sliding over 
the imaged surface, we must impose the condition that the displacement vector, 
$\Delta \mathbf{R}$, should be, at each point, perpendicular to the local surface normal, i.e., 

$$ \Delta \mathbf{R} \cdot \hat{\mathbf{n}} = 0 \qquad (5) $$ 

where 

$$ \hat{\mathbf{n}} = \frac{(-p, -q, 1)}{\sqrt{1 + p^2 + q^2}} \qquad (6) $$ 

is the unit normal vector. Thus, from (5), and using the third relation in (4), we 
easily obtain the constraint equation 

$$ V(x, y) = D_x p + D_y q + Bx - Ay \qquad (7) $$ 
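
For completeness, the step from the orthogonality condition to the constraint on the translational component can be written out as follows. The normalising factor of the unit normal is strictly positive, so it drops out of the condition; the flow components and the z-displacement are those of the motion equations.

```latex
\begin{align*}
\Delta\mathbf{R}\cdot\hat{\mathbf{n}} = 0
  &\;\Longrightarrow\; -p\,\Delta x - q\,\Delta y + \Delta z = 0 \\
  &\;\Longrightarrow\; -p\,D_x - q\,D_y + V(x,y) + Ay - Bx = 0 \\
  &\;\Longrightarrow\; V(x,y) = D_x\,p + D_y\,q + Bx - Ay .
\end{align*}
```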



The first two relations in (4), along with equation (7), form the basis of 
our shape estimation strategy: we will try to determine the surface function 
and the rotation parameters which best satisfy the former, while minimizing the 
translational factor, $V(x, y)$, in the latter. We propose a least-squares estimation 
from the functionals 

$$ \mathcal{E}_1 = \iint \left\{ [D_x + Cy - Bz]^2 + [D_y - Cx + Az]^2 \right\} dx\,dy \qquad (8) $$ 

and 

$$ \mathcal{E}_2 = \iint [D_x p + D_y q + Bx - Ay]^2\, dx\,dy \qquad (9) $$ 

which come from (4) and (7), respectively. 

Minimization of the integrand in (8) with respect to z immediately yields 

$$ z(x, y) = \frac{B D_x - A D_y + C(By + Ax)}{A^2 + B^2} \qquad (10) $$ 



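Employing (10) back in (8) and minimizing with respect to A, we obtain 

$$ A = \gamma B \qquad (11) $$ 

with 

$$ \gamma = \frac{\alpha \pm \sqrt{\alpha^2 + 4\beta^2}}{2\beta} \qquad (12) $$ 

where 

$$ \alpha = \iint [(D_x + Cy)^2 - (D_y - Cx)^2]\, dx\,dy \qquad (13) $$ 

and 

$$ \beta = \iint [(D_x + Cy)(D_y - Cx)]\, dx\,dy \qquad (14) $$ 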
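
The origin of the quadratic solution for the ratio of rotation components can be sketched as follows. The shorthand $u = D_x + Cy$, $v = D_y - Cx$ is introduced here only for this derivation and is not the paper's notation; $\gamma$ denotes the ratio $A/B$, and $\alpha = \iint (u^2 - v^2)\,dx\,dy$, $\beta = \iint uv\,dx\,dy$.

```latex
\begin{align*}
z = \frac{Bu - Av}{A^2 + B^2}
  &\;\Longrightarrow\;
  (u - Bz)^2 + (v + Az)^2 = \frac{(Au + Bv)^2}{A^2 + B^2}
  = \frac{(\gamma u + v)^2}{1 + \gamma^2}, \\
\frac{\partial}{\partial\gamma}\iint \frac{(\gamma u + v)^2}{1+\gamma^2}\,dx\,dy = 0
  &\;\Longrightarrow\; \beta\gamma^2 - \alpha\gamma - \beta = 0, \\
  &\;\Longrightarrow\; \gamma = \frac{\alpha \pm \sqrt{\alpha^2 + 4\beta^2}}{2\beta}.
\end{align*}
```

For noise-free flow generated by a pure rotation, $u = Bz$ and $v = -Az$, and the root with the minus sign reduces to $A/B$.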


Equation (10) may therefore be rewritten as 

$$ z'(x, y) = B\,z(x, y) = \frac{D_x - \gamma D_y + C(y + \gamma x)}{1 + \gamma^2} \qquad (15) $$ 

giving an estimate of the imaged surface, up to a multiplicative constant, in 
terms only of the photometric disparity field and the rotation component C. 
As such component cannot be obtained independently of $z'$, we then proceed 
as follows: assuming C = 0 in equations (12) to (15), we arrive at an initial 
estimate for $z'$. From this, we may update the C value: assuming $z(x, y)$ fixed, 
and minimizing with respect to C, we get 

$$ C = \frac{\iint [z'(\gamma x + y) + D_y x - D_x y]\, dx\,dy}{\iint (x^2 + y^2)\, dx\,dy} \qquad (16) $$ 

Plugged back into equations (12) to (14), this in turn gives a new estimate for $\gamma$, 
and thus for $z'$ through (15). The whole process can then be repeated, and has 
been found to converge in a few iterations. 
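
The alternating scheme just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the integrals are replaced by sums over sample points (x, y, Dx, Dy), the function names are hypothetical, the choice between the two roots for gamma is left to the caller, and no safeguard is included for the degenerate case in which the cross term beta vanishes.

```python
import math


def gamma_roots(samples, C):
    """Both roots for gamma = A/B, given samples (x, y, Dx, Dy) and C."""
    u = [dx + C * y for x, y, dx, dy in samples]
    v = [dy - C * x for x, y, dx, dy in samples]
    alpha = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    beta = sum(ui * vi for ui, vi in zip(u, v))
    root = math.sqrt(alpha * alpha + 4.0 * beta * beta)
    return (alpha + root) / (2.0 * beta), (alpha - root) / (2.0 * beta)


def scaled_depth(samples, C, gamma):
    """Scaled surface estimate z' = B z at every sample point."""
    return [(dx - gamma * dy + C * (y + gamma * x)) / (1.0 + gamma ** 2)
            for x, y, dx, dy in samples]


def update_C(samples, z_scaled, gamma):
    """Least-squares update of C for a fixed surface estimate."""
    num = sum(zs * (gamma * x + y) + dy * x - dx * y
              for (x, y, dx, dy), zs in zip(samples, z_scaled))
    den = sum(x * x + y * y for x, y, dx, dy in samples)
    return num / den


def estimate(samples, which_root=1, n_iter=5):
    """Alternate between (gamma, z') and C, starting from C = 0.

    Returns the last gamma and z' together with the C value
    updated from them.
    """
    C = 0.0
    for _ in range(n_iter):
        gamma = gamma_roots(samples, C)[which_root]
        z_scaled = scaled_depth(samples, C, gamma)
        C = update_C(samples, z_scaled, gamma)
    return gamma, C, z_scaled
```

With exact flow generated by a known rotation, `gamma_roots` called with the true C returns A/B as its second root, `scaled_depth` returns B times the true surface, and `update_C` returns the true C, which gives a simple consistency check for the formulas.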

Now, we can also estimate B through our second functional: rewriting $\mathcal{E}_2$, in 
equation (9), in terms of $p' = \partial z'/\partial x$ and $q' = \partial z'/\partial y$, and minimizing the resulting 
expression with respect to B, yields 

$$ B^4 = \frac{\iint [D_x p' + D_y q']^2\, dx\,dy}{\iint (\gamma y - x)^2\, dx\,dy} \qquad (17) $$ 

which completes our estimation of $z(x, y) = z'(x, y)/B$. 

We obtain two pairs of maps for z, due to the double value for $\gamma$ in (12), 
and to the plus and minus signs for B in (17). One of these maps is usually a 
plausible reconstruction of the imaged surface. 

It is interesting to remark that, in such reconstruction, the reflectance map 
functions, which are at the core of the traditional approach to photometric 
stereo [1,2], play no part. Also, with the optical flow constraint slightly modified 
to take into account the fact that intensities may not be conserved 
if the reflecting properties of the surfaces are non-uniform, our approach to PS 
has yielded good reconstructions for surfaces with a position-dependent albedo. 
In such cases, we based the optical flow estimation on a constraint of the form 

$$ a_t \Delta I(s) = a_x D_x(s) \frac{\partial I_2}{\partial x} + a_y D_y(s) \frac{\partial I_2}{\partial y} \qquad (18) $$ 



where ^ 

^ jj/Xlrnax ^ 

for $i = x, y$ or $t$. 

The terms appearing in (19) denote the maximum and the average values, 
in a small window centered on site $s$, of the absolute intensity differences, 
along the x, y and t dimensions, of the PS image pair. Thus, the $a_i$'s essentially 
modulate the values of the associated derivatives in (18) (recall that 
$\Delta I(s) = I_1(s) - I_2(s)$ plays the role of a time derivative) by the reciprocal of a 
measure of the intensity variations in the neighborhood of the considered point, 
thereby somewhat shielding the optical flow estimates from the influence of the 
inhomogeneities in surface albedo. The form of the $a_i$'s, with $\eta$ a free parameter, 
has been chosen so that (18) remains invariant under linear transformations of 
the input images. 

3 Experiments 

Figures 1 to 3 show results of surface estimation experiments with our optical 
flow approach to photometric stereo. Each figure depicts a pair of PS input 
images and the reconstructed surface. In all the experiments, the illumination 
vectors were of the form $S = (\sin\sigma, 0, \cos\sigma)$; the flow field, $D(s)$, was obtained 
through Horn and Schunck’s iterative algorithm [5] and then employed in equation 
(10), with A, B and C estimated as described, to yield $z(x, y)$. 

In the first experiment, the inputs were four pairs of synthetic images of a Lambertian 
ellipsoid, obtained for the illumination directions ($\sigma$ values) ($-30^\circ$, $-10^\circ$), 
($-10^\circ$, $0^\circ$), ($0^\circ$, $10^\circ$) and ($10^\circ$, $30^\circ$), and $D(s)$ was taken as the sum of the flow 
fields resulting for each image pair. 

In the second experiment, we considered a sphere with Lambertian plus quasi-specular 
reflectance, and non-uniform albedo (the central stripe has lower albedo 
than the rest of the surface). A single synthetic image pair was used (illumination 
directions $-30^\circ$ and $30^\circ$), and the modified constraint form (18) was employed, 
instead of (2), in Horn and Schunck’s optical flow estimation algorithm. 

Lastly, the third experiment deals with reconstructing the shape of a real 
vase, of approximately Lambertian reflectance, from two input image pairs, 
obtained for the illumination directions ($-10^\circ$, $0^\circ$) and ($0^\circ$, $10^\circ$). 

4 Discussion 

We have introduced a process of shape estimation from the optical flow associ- 
ated with pairs of photometric stereo images obtained under slightly different 
illuminations. Such flow results from the displacement of the irradiance pattern 
over the scene, which we have modelled as a rigid-body rotation coupled with a 
small position-dependent translation along the optical-axis direction. This choice 
has been motivated by an analogy with a situation where the illumination is kept 
fixed, but the observer rotates about the scene. For a surface symmetrical about 
the rotation axis, such movement would give rise to the same optical flow field 
as would result, for a fixed observer, from an equal rotation of the illumination 
source in the opposite direction. Since a rotation of the observer under fixed 
illumination will also be, in this case, equivalent to an opposite rotation of the 
surface, along with its initial irradiance pattern (as if the irradiance pattern had 
been painted on the imaged surface), we may conclude that, for such symmet- 
rical surfaces, the optical flow resulting from a change of illumination can be 
exactly described as arising from a rotation of the 3-D irradiance pattern on the 
scene. 

This is certainly not true for a general asymmetrical surface, but when the 
change in illumination direction is small and the surface is smooth, we have found 
that we are still able to model the irradiance pattern movement as a rotation, 
provided that we allow for a correction factor in terms of a position-dependent 
translation along the optical-axis direction (direction z). As is known, under 
orthographic projection any such translation does not contribute to the optical 
flow; therefore, for any assumed rotation, there will be an infinite number of 
surfaces, differing by the translational factor, which will be equally consistent 
with the observed irradiance displacement field. The translational factor thus 
represents an extra degree of freedom, which allows us to consider the estimation 
of a great variety of shapes through a very simple model: given the optical flow of 
the PS image pairs, we try to determine the surface rotation which best explains 
such flow and which, at the same time, minimizes the unobservable translational 
factor along z. 

Such framework has led to an expression for the surface function in terms 
only of the parameters of the movement, which are estimated via a least-squares 
approach. It is interesting to compare this expression (equation (10)) with the 
corresponding relation obtained when matching the PS images along a fixed 
direction, as in [3]. There, the surface function is recovered as 

+ k 2 DY)h{s) - ko{kix + k 2 v) + F{k 2 X - kiy) 
y>— ,2 I 7.2 

iv-^ rv<2 

where $k_0$, $k_1$ and $k_2$ are the linear coefficients of the reflectance map for the 
difference image, $I_1 - I_2$, and F is an arbitrary function. It is easy to see that 
the C-factor in (10) corresponds to the function F in (20). Moreover, if we take 
as our surface estimate, instead of the z in (10), a $\bar{z}$ given by 

$$ \bar{z}(x, y) = z(x, y) + \frac{\Delta z}{2} \qquad (21) $$ 

(which amounts to taking, as the depth value at the point $(x, y)$, the average of 
the estimates at that point and at the matched point $(x + D_x, y + D_y)$), there 
results, by neglecting the factor $V(x, y)$, essentially the same functional relation 
as (10), i.e., 

$$ \bar{z}(x, y) = \frac{B D_x - A D_y + (A^2 + B^2)(Ay - Bx)/2 + C(By + Ax)}{A^2 + B^2} \qquad (22) $$ 



Abstract. As virtual reality evolves towards more natural interfaces, 
new contactless interaction based on gesture recognition is unfolding. 
This interaction is supported by geometric, dynamic and cognitive modelling 
of gestures. Like other branches of artificial intelligence, computer 
vision plays an important role in this modelling. 

The purpose of this paper is to describe how computer vision is helping 
to develop virtual reality and present some interfaces developed in our 
laboratory. 



1 Introduction 

In 1945, Vannevar Bush conceived the use of computers beyond calculation 
and thought about them as a fundamental tool for transforming human thought 
and creative activity [2]. He anticipated the use of computers for multimedia 
processing. 

Since Bush’s time many technological breakthroughs have occurred, but computers 
are still limited in their multimedia understanding [18]. This means that 
we have increased the effective bandwidth of information from computers to humans, 
by sending audio, images, graphics and haptic data, but the same rate 
of improvement has not happened in computer understanding. Most computers 
still receive input from low-bandwidth devices like keyboards or mice. Only a few 
interfaces are able to understand application-related domains of audio, visual or 
haptic information [7]. 

Several researchers have identified this imbalance and are working on more 
intuitive interfaces like virtual reality, speech recognition, image understanding 
and multimodal interfaces [3,18]. In the rest of this paper we will concentrate on 
image understanding techniques which are relevant for virtual reality. 

The main conceptual components of a virtual reality system are: a) immersion, 
the ability to experience a 3D world as a reality; b) viewpoint, which refers to the 
point of observation of the user; c) navigation, which allows the user to change viewpoint; 
and d) manipulation, the capacity to interact with and change the relative position 
of objects in the environment [14]. 

Another important component of virtual reality (VR) is tracking, since it 
helps to establish the correct position of the user in the environment, and therefore 
provides the appropriate visualization and interaction [36]. There are many 
technologies which help in tracking body parts, and they are based on electric, 
magnetic, ultrasonic, infrared, optic or image analysis principles [3]. In this paper, 
we are interested in this last type of tracking. 

Since visual perception allows many organisms to interact successfully with 
their surroundings, it is natural to think that computer vision (CV) might help 
to bring people and computers closer together. This interaction would be based on the 
interpretation of body language, through gesture recognition. This new research 
topic is bringing together people with backgrounds in computer vision, human 
computer interaction, psychology, and artificial intelligence. 

In the rest of this paper we will concentrate on gesture recognition as it ap- 
plies to human computer interaction in general, and virtual reality in particular. 
Also, we will present a general methodology which emerges from related works 
worldwide. Finally, we will describe the main interfaces developed in our lab, 
and how they compare to other similar systems. 

2 Computer Vision and Virtual Reality 

At first glance, computer vision and virtual reality do not seem related; in fact, 
they look like opposites. CV attempts to reconstruct surfaces, recognize objects and 
provide motion information of objects from images. That is, it goes from images 
to objects. In contrast, computer graphics and VR work from object models to 
images. So, in a certain sense, they are complementary [22]. However, the boundary 
between these areas is becoming thinner as many people realize that the 
solution to key problems in both areas involves a close interaction between the 
physics of image formation and the geometric and mechanical modelling of object shape, 
deformation and motion [33,34,8], as well as cognitive modelling of actions and 
behaviour to express the responses of agents or organisms to visual stimuli [9]. 

Other problems that are of special interest to both fields, but from different 
points of view, are stereoscopic vision and object tracking. In the case of CV, 
the problem of stereopsis is object reconstruction from two images of the object. 
For VR, the problem is how to obtain two different images of a scene composed 
of objects, so that they are appropriate for visualization in devices like head-mounted 
displays. The common link is that of stereoscopic visualization as achieved by 
the human brain [16]. 

The problem of object tracking is very important for both areas. In 
the case of VR, it helps to track body parts to provide the appropriate feedback on 
changes in position of the user or objects in the virtual environment. Also, real-time 
tracking is vital for proper visualization and for reducing the so-called “lag 
time” [36]. 

In the case of computer vision, object tracking is also very important in 
the context of time-varying imagery, real-time vision systems and robotic systems 
with visual capacity [8]. In addition, object tracking is very important for 
organisms to follow predators and prey. 

Recently, CV and VR have come closer as many people realize that more 
natural interfaces can be built by using CV to provide object tracking, that is, 
following body parts or providing gesture recognition, without the user having to wear 
any special equipment. This has the advantage of providing more freedom of movement 
and makes the computer adapt to humans, rather than the other way 
around. 

Another close interaction between CV and VR has come from what is called 
augmented reality [1]. The idea is to register in a common environment, real 
and virtual objects. That is, in these systems it is possible to see the real world 
augmented with computer generated objects. For example, a surgeon during an 
operation will be able to see an MRI volume superimposed with the images of 
the actual patient [13]. 

CV is also helping to animate VR, especially when the environment incorporates 
autonomous agents. Some artificial life simulation programs use synthetic 
vision as a sensor for organisms, so they can react to the presence of other beings, 
for instance by approaching or receding [30]. 



2.1 Related Works 

We are interested in research works which use computer vision as a general human 
computer interface. The most general context in which we find this concept 
is in what are called smart rooms and smart clothes [23]. The idea is to have 
many cameras and computers in a network which are continuously analysing the 
images of people. These cameras can be in different places in a room or street, or 
they can be attached to human clothes. As a result of the analysis, people can 
be recognized, tracked, or communicate with computers, and decisions can be 
made. 

Other works are more specific and involve face recognition [31], emotion 
recognition [23], sign language understanding [32], teleoperation in virtual reality 
robotics environments [20], tracking of the whole human body for surveillance applications 
[5], hand tracking [10,25,15], iris tracking and recognition 
[41,17,27,24,40] and head tracking for general human computer interaction 
[26,4,39]. 



2.2 Framework to Relate VR and CV 

From the analysis of works which involve CV and human computer interaction, 
we have obtained a framework which helps to understand previous work, and 
also serves as a guide to develop new interfaces (Figure 1). 

In computer graphics and virtual reality we have a database of 3-D models 
which are used for rendering and visualization. In these systems, the user 
navigates or interacts with the environment through a graphical user interface 
(GUI), or directly with special hardware (data glove, HMD, etc.). In any case, 
the actions of the user modify the database of graphics objects and this in turn 
changes the visualization. 

What is relatively new in VR is telemanipulation and augmented reality, 
where the graphic objects in the database have some correspondence with actual 
objects in the real world. In this case, by moving one graphics object, for 
instance a 3-D model of a robot arm, it is possible to move a real robot, possibly 
in a distant location [20]. Also, in augmented reality the correspondence and registration 
are more critical, since virtual and real objects have to appear as sharing 
the same space [1]. Vision techniques have an important role in helping to provide 
proper registration and in maintaining the consistency and correspondence 
of real and virtual worlds [20,13]. 




In the case of direct, contactless human-computer interaction through gesture
recognition, the vision system plays a more active role, since it allows the
user to communicate with the computer without having to wear any special
hardware [23]. Since the system must have a representation of the gesture to
recognize or track, we can think of this representation as geometric and dynamic
model information that, once fetched by the model-based vision module, guides
its search through the image data. This model can be implicit in the system or
explicitly represented, and could be stored in a 2-D or 3-D model database.

The images are analyzed and features and external forces are extracted which 
drive model matching, and the vision module updates the model parameters and 
dynamics for the best fit [6,33,34,8]. 
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3 Trackers Developed in Our Laboratory 

As a result of our literature review we have defined the objective of our research
group as the study of computerized visual perception as a means of interaction
between humans and computers, and between robots or agents (bots) and real or
virtual environments. Until now, our work has concentrated on developing
human-computer interfaces through computer vision in general, and in particular
for virtual reality systems (Figure 2). Our graphical interfaces are original, as
are some of the tracking methods that we have developed. In the next subsections
we describe the features of our iris, hand and glasses trackers, and the GUIs
that have been implemented for manipulation and navigation in virtual reality
systems.




Fig. 2. Work environment 



3.1 Iris Tracker 

The novelty of our iris tracker is that we do not need to attach the camera to the
user’s head, or provide the initial position of the head as in other works [17,41].
We do not provide the rotation of the eyes, but we are able to obtain the center
and radius of a pair of circles, one fitted to each iris [27], as illustrated in
Figure 3. Taking the midpoint of the segment that joins the centers of the
circles, we define a visual cursor that can be employed for activating buttons
in GUIs. By using visual fixation, if both irises remain in the same position
for several frames (5 in our implementation), we interpret this behaviour as the
location of a point of interest, and we can associate it with the “click” of a
button.
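
The cursor and fixation rule described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `DwellClicker` and the pixel `tolerance` used to decide that the cursor "remains in the same position" are hypothetical, and only the 5-frame count comes from the text.

```python
from math import hypot

def visual_cursor(left_center, right_center):
    """Midpoint of the segment joining the two iris-circle centers."""
    (x1, y1), (x2, y2) = left_center, right_center
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

class DwellClicker:
    """Signal a click once the cursor has stayed put for `frames_needed` frames."""

    def __init__(self, frames_needed=5, tolerance=3.0):
        self.frames_needed = frames_needed  # 5 frames, as stated in the text
        self.tolerance = tolerance          # pixels; hypothetical threshold
        self.anchor = None                  # where the current fixation started
        self.count = 0                      # consecutive frames near the anchor

    def update(self, cursor):
        """Feed one cursor position; return True on the frame that completes a fixation."""
        near = (self.anchor is not None and
                hypot(cursor[0] - self.anchor[0],
                      cursor[1] - self.anchor[1]) <= self.tolerance)
        if near:
            self.count += 1
        else:
            # The cursor moved away: start a new fixation at this position.
            self.anchor = cursor
            self.count = 1
        # Fire exactly once, on the frame where the count is reached.
        return self.count == self.frames_needed
```

Firing only on the frame where the count is reached avoids repeated clicks while the user keeps looking at the same button.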

Our iris detector works by applying edge detection to each frame, followed
by thinning and a search for circles using a Hough transform. The most promising
pair of circles is selected by applying heuristics [27]. The performance of this
tracker is between 3 and 5 frames/second running on an SGI Indy workstation. For
our algorithms to work properly, it is important to have good illumination that
generates good contrast in the images, so that the edges appear well defined. We
have found this experimental condition to be the most critical for our
implementation.
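
The circle search can be illustrated with a small Hough accumulator. This is a generic sketch, not the implementation of [27]: it assumes the edge points are already extracted and thinned, that the candidate radii are known, and it omits the heuristics used to select the pair of circles. The function names and the angular resolution `n_angles` are assumptions.

```python
from math import cos, sin, pi
from collections import Counter

def hough_circles(edge_points, radii, n_angles=72):
    """Count, for each cell (cx, cy, r), how many edge points support that circle."""
    votes = Counter()
    for (x, y) in edge_points:
        cells = set()  # each edge point supports a given cell at most once
        for r in radii:
            for k in range(n_angles):
                theta = 2.0 * pi * k / n_angles
                # A circle through (x, y) with radius r has its center here.
                cells.add((round(x - r * cos(theta)),
                           round(y - r * sin(theta)), r))
        votes.update(cells)
    return votes

def best_circle(edge_points, radii):
    """Return the (cx, cy, r) cell supported by the most edge points."""
    return hough_circles(edge_points, radii).most_common(1)[0][0]
```

For two irises, one would take the two strongest, well-separated cells instead of only the best one.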







Fig. 3. Iris detector, (a) Edges on the image, (b) The mark represents the visual 
cursor which is defined by the position of the eyes. 



3.2 Hand Tracker 

Our approach for the hand tracker is not as powerful and general as in other 
works [10,25,15], but we provide simpler solutions for tracking a hand with ex- 
tended fingers [24,28]. 

Our method uses edge detection and frame differences to extract moving features
with high gradients. Then, we apply a thinning algorithm, followed by a
Hough transform to detect lines [12]. Finally, we extract the most meaningful
segments, which correspond to the fingers, and obtain a global centroid from all
the segments. This centroid provides a reference position for a cursor (Figure 4).
The changing size of the segments provides cues about whether the hand is
approaching or receding from the camera. The performance of the hand
tracker is also between 3 and 5 frames/second on the same platform as described
above.
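
A minimal sketch of the last two steps, assuming the finger segments have already been extracted as pairs of endpoints. The text does not say how the centroid is weighted, so the unweighted mean of all endpoints is used here as an assumption. The 5% threshold `eps` and the function names are also hypothetical.

```python
from math import hypot

def segment_length(seg):
    """Euclidean length of a segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return hypot(x2 - x1, y2 - y1)

def hand_cursor(segments):
    """Global centroid of all segment endpoints, used as the cursor position."""
    pts = [p for seg in segments for p in seg]
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def proximity_cue(prev_segments, curr_segments, eps=0.05):
    """Compare mean segment length between two frames."""
    prev = sum(map(segment_length, prev_segments)) / len(prev_segments)
    curr = sum(map(segment_length, curr_segments)) / len(curr_segments)
    ratio = curr / prev
    if ratio > 1 + eps:
        return "approaching"   # fingers look longer: hand moved towards the camera
    if ratio < 1 - eps:
        return "receding"      # fingers look shorter: hand moved away
    return "steady"
```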



3.3 Glasses Tracker 

Some types of stereoscopic glasses do not provide head tracking, so we decided to
develop a head tracker using computer vision. Since the model that we have
(Crystal eyes, Stereoscopic corp.) has a rectangular frame, the idea was to fit
parallelograms to the different images. A rectangular shape deforms into a
trapezoidal shape under perspective transformation; in this case we use an affine
approximation, and under these conditions a rectangle deforms into a
parallelogram. For this method to work, the perspective effect should remain
small, so it is important that the glasses do not get too close to the camera
during side view.

Fig. 4. Hand detector. This image shows how to establish communication with
the computer, using a cursor defined by the position of the segments that
approximate the fingers.

For each frame the method searches for high-gradient points, and from them
a contour following algorithm is applied [12]. Only closed contours that
satisfy a shape criterion are preserved. If two contours remain that approximate
the shape of the glasses, these are fitted with an affine transformation using the
method of normalization provided with the DIAS software [35,37]. From the
affine deformation of the rectangle into a parallelogram, it is possible to work
out the 3-D rotation of the glasses in space [39]. The lines shown in Figure 5 are
the projections of a spatial line segment in the direction of the line of sight.
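
To make the affine approximation concrete, the sketch below computes the affine map that sends a w x h rectangle onto an observed parallelogram from three corner correspondences. It is not the normalization method of the DIAS software and it does not recover the 3-D rotation. The function names and the corner ordering are assumptions made for illustration.

```python
def affine_from_rectangle(w, h, p0, p1, p3):
    """Affine map x' = A x + t sending (0,0)->p0, (w,0)->p1 and (0,h)->p3."""
    a11 = (p1[0] - p0[0]) / w   # image of the rectangle's horizontal unit vector
    a21 = (p1[1] - p0[1]) / w
    a12 = (p3[0] - p0[0]) / h   # image of the rectangle's vertical unit vector
    a22 = (p3[1] - p0[1]) / h
    return ((a11, a12), (a21, a22)), (p0[0], p0[1])

def apply_affine(A, t, point):
    """Apply x' = A x + t to a 2-D point."""
    x, y = point
    return (A[0][0] * x + A[0][1] * y + t[0],
            A[1][0] * x + A[1][1] * y + t[1])
```

Under an affine map the fourth corner (w, h) necessarily lands on p1 + p3 - p0, which is why the image of the rectangle is a parallelogram. How far an observed fourth corner lies from that point gives a rough indication of how much perspective effect is present.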

Our implementation of this algorithm using the DIAS software [35], has a 
performance of 10 frames/sec using images of 160 x 120 pixels on a Pentium PC 
(200 MHz) [39]. We have not found similar works related to glasses tracking, but 
only to head tracking [4,26]. We think that our method can be more accurate for 
head tracking than those reported in the literature, because we track specific, 
well defined features (glasses in our case), but we cannot make definite claims 
until we perform a formal experimental analysis. 

4 Graphical User Interfaces Developed in Our Laboratory 

The graphical user interfaces that we have developed allow the manipulation
of 2-D and 3-D objects with hand or eye movements. It is possible to select
different objects, change their orientation in space, bring an object closer or
move it away, navigate in a 3-D environment or change the relative position of
objects [24]. The first interface is used to manipulate objects so that they move
according to the movements of the user’s head (Figure 6).
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Fig. 5. Determination of the line of sight from the parallelogram that fits the 
glasses. The image point corresponding to the center of the glasses is shifted into 
the center of the image. 




Fig. 6. The cube changes its position and orientation while the user’s head is
moving.
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Another graphical user interface works as a visualization tool for 3-D objects. 
It uses the cursor defined by the iris detector to activate some buttons which 
serve to select, rotate, or bring near or far an object (Figure 7). The iris must 
be detected at least five times to activate a button. 



Fig. 7. Examiner of 3-D objects. Button number 3 is selected.



Figure 8 shows another interface. We can move in the virtual environment
only by translations, which correspond to the translations of the cursor with
respect to its initial position.

Besides navigation, we implemented the selection of objects contained in
the virtual environment (Figure 9).

5 Conclusion and Future Work 

We have analyzed works in the literature that relate computer vision and virtual 
reality and as a result we have proposed a framework to understand present and 
future works. Also, we presented the contribution of our group in developing 
eyes, hand and glasses trackers and their graphical user interfaces. 

Our ongoing work in collaboration with the National Autonomous University
of Mexico (UNAM) and the University of Houston-Downtown (UH-D) involves
the creation of cooperative virtual environments with gesture recognition
interfaces and autonomous, intelligent agents, and will be reported in future
academic events.



Fig. 8. Navigation in a virtual environment by using the iris tracker



Fig. 9. The frame box around the flag indicates that this object has been se- 
lected. 
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Abstract. Recently, research on mobile robotics has been focused on 
achieving reliable performance on autonomous systems. We believe that 
one possible way to do this is by using landmarks to bound uncertainty 
during the path planning and navigation stages. In this paper, we present 
an algorithm to compute the position of artificial visual landmarks in a 
mobile robot workspace. We aim to maximize the region in the workspace 
from where a landmark can be seen, i.e., to define the position of the
landmarks in the workspace. After pointing out that this problem is com- 
binatorial in nature, we propose a simulated annealing type of technique 
to find the optimal landmark arrangement. 



1 Introduction 

There is a growing interest in making autonomous robot technology available in
current human environments. Nevertheless, this poses certain difficulties that
have not been settled by our current state of knowledge in the areas of
perception, real-time control and reasoning [6]. It has been argued that this is
because human environments carry cultural information that provides humans with
cues that facilitate navigation and goal execution. Researchers [2] have proposed
engineering the environment, via the use of artificial landmarks, to reduce the
complexity of the model of the environment. In this way, a robot could use its
computing resources to solve a commanded task within a required time frame.
Nowadays, the use of artificial landmarks for navigation and self-localization is
widespread. This is why we feel that a study of how to place them optimally
in the robot workspace is worth pursuing.

In this paper, we study the landmark placing problem, defined as computing 
the position of the landmarks in the workspace that maximize their joint cover- 
age. In a recent paper, Tashiro et al. [9] studied this problem using a signboard 
landmark made of four LEDs. In their approach, they have a model of the lo- 
calization error of the robot with respect to the landmark. Their objective is to 
minimize the maximum localization error. We feel that this approach may offer 
advantages when preattentive visual capabilities are available. 

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 274-282, 1998. 
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The landmark placing problem may be seen as a generalization of the art 
gallery problem [8] where restrictions apply to the perceptual capabilities of
the guards. Theory says that in two dimensions, $\lfloor n/3 \rfloor$ guards are
always sufficient and sometimes necessary to cover a polygon of n sides. In this
case, the assumption is that the guards have complete peripheral perception and
can detect features in the environment that are either infinitely far or
arbitrarily close. From another interesting perspective, the landmark placing
problem can be reformulated in terms of a sensor placement problem, exchanging
landmarks for sensors. Then, the problem may be to cover the maximum possible
area with the minimum number of sensors. Clearly, the outcome depends on the
model of the sensor used. Consider, for instance, Zhang’s [10] studies about how
to optimally place acoustic transceivers in a two-dimensional space.

The use of artificial landmarks is widespread in problems such as navigation,
localization and path planning. For instance, Bessiere et al. [3] and Lazanas
and Latombe [6] describe methods for path planning based on the assumption
that landmarks exist in the environment. We see our research on the landmark
placing problem as complementary to these techniques.

The problem of selecting an optimal arrangement among different possibilities
is combinatorial in nature. In order to solve it, we use a simulated annealing
algorithm [5]. In §2, we describe the landmark placing algorithm. Then in §3, we
present experimental results. Finally, we conclude by discussing the implications
of our results and pointing out research directions.




Fig. 1. Idealization of a landmark visibility region. A landmark on a wall at 
position r can be observed from a distance d < x < D spanned over an 
angle p.


Fig. 2. Results of the algorithm for 1-14 landmarks in a workspace. 



2 Placing Landmarks 

Firstly, we define the constraints, suppositions and objective function, and then 
we propose a simulated annealing algorithm to find the optimal value of the 
objective function. 



2.1 Landmark Configurations 

Fig. 3. We emulate an office-like environment with cardboard walls. (a) Office
emulation: landmarks are placed at each gray box. (b) Visibility regions of the
landmarks. (c) The robot enters a room; the landmarks with ID numbers 0 and 1
are beside the doorway.

278 Joaquin Salas and Jose Luis Gordillo

Fig. 4. Layout and examples of landmark placing for a portion of our lab.
(a) Layout. (b) 10 landmarks. (c) 15 landmarks.

Let us define the robot workspace $W$ as a polygonal region in $\mathbb{R}^2$.
Suppose that the workspace contains a set $\mathcal{B}$ of polygonal obstacles. A
landmark visibility region $\mathcal{U}(p)$ is a subset of $W - \mathcal{B}$ from
where the robot can potentially detect a landmark at $p$. Suppose that the robot
has a visual system such that once a landmark is detected it can compute some
geometric information such as the position of the robot with respect to the
landmark. The reliability of the collected information depends, among other
things, on the relative positions of the robot and the landmark, the capabilities
of the robot visual system, and the landmark structure. Loosely speaking, the
geometry of $\mathcal{U}$ is defined by the range of distances and the range of
angles from where the visual system can extract information from a landmark. In
our case, reliability may be thought of as a property that tells us how close a
parameter is to its true value. One may hypothesize that the visibility region
for each parameter, and each landmark, is different because they are all based on
different image analyses. For instance, the distance may be inferred from the
apparent size of the projected landmark, while the orientation may be computed
from the apparent image distortion of the landmark. In each case, the error is
different and as a consequence generates a different visibility region.
Nevertheless, a visual system would have difficulty extracting useful information
if it is too close to or too far from the landmark center. Also, some problems
will be experienced if the line of sight of the visual system deviates too much
from the landmark normal. For simplicity, we idealize $\mathcal{U}(p)$ as a
section of a circular ring (see Fig. 1) spanning an angle
$\rho \in (\phi(u) - \Delta, \phi(u) + \Delta)$ and radius $r \in (d, D)$, where
$\phi(u)$ is the angular orientation of the vector $u$ and the angle $\Delta$ and
lengths $d$, $D$ are parameters that characterize the visual system. Also, we
assume that a landmark is a planar patch that can be pasted on planar walls. In
order to take into account the workspace boundaries and the existence of
obstacles, we define the working visibility region $V$ as the intersection of
$\mathcal{U}$ with the workspace minus the obstacles. In other words,

$$V(p) = (\mathcal{U}(p) \cap W) - T(\mathcal{B}, \mathcal{U}(p)) \qquad (1)$$

where $T$ returns the region that contains all the points in the landmark’s
visibility area $\mathcal{U}$ such that for the line that goes from one of these
points to $p$ there is at least one point contained in the obstacle
$\mathcal{B}$. That is,

$$T(\mathcal{B}, \mathcal{U}(p)) = \{\, q \mid q \in \mathcal{U}(p),\ \exists\, t \in \overline{pq} : t \in \mathcal{B} \,\} \qquad (2)$$
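
The idealized visibility region (a section of a circular ring) translates directly into a point membership test. The sketch below assumes the landmark sits at p with its orientation given by a vector u, and it deliberately ignores the workspace boundary and the occlusion term T of Eq. (1). The function names are hypothetical.

```python
from math import atan2, hypot, pi

def angle_diff(a, b):
    """Signed difference a - b wrapped to the interval (-pi, pi]."""
    d = (a - b) % (2 * pi)          # now in [0, 2*pi)
    return d - 2 * pi if d > pi else d

def in_visibility_region(q, p, u, d, D, delta):
    """True if point q lies in the ring section U(p).

    q is visible when its distance r to the landmark p satisfies d < r < D
    and the direction from p to q is within delta of the orientation of u.
    """
    rx, ry = q[0] - p[0], q[1] - p[1]
    r = hypot(rx, ry)
    if not (d < r < D):
        return False                # too close or too far
    rho = atan2(ry, rx)             # direction from the landmark to q
    phi = atan2(u[1], u[0])         # angular orientation of u
    return abs(angle_diff(rho, phi)) < delta
```

Wrapping the angular difference matters for landmarks whose orientation is close to ±π, where a naive subtraction would reject points that are actually in front of the landmark.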

Given a certain workspace, we define a configuration $\mathcal{P}$ as the set of
$n$ landmarks placed on points $\{p_1, \ldots, p_n\}$ along the boundary of $W$
or $\mathcal{B}$. Under such conditions, the function to optimize is the area of
the union of the working visibility regions $V(p_i)$ for a given configuration
$\mathcal{P}$,

$$A\left(\bigcup_{i=1}^{n} V(p_i)\right) = \sum_{i=1}^{n} B(V(p_i)) - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} B(V(p_i) \cap V(p_j)) + O_3() \qquad (3)$$

where $O_3()$ is the resulting area of intersecting three or more working
visibility regions $V(p_i)$ and $B$ is a function that computes the area of a
polygon. We will dismiss the term $O_3()$, assuming that the landmarks will be
well spread and as a result the intersection of three or more working visibility
areas will be small. The area of a polygon can be efficiently computed
considering only the vertex points as follows. Let $\mathcal{A}$ be a polygon
with vertices $\{a_1, a_2, \ldots, a_v\}$ sorted in counterclockwise order (also
assume that $a_1 = a_{v+1}$). A vertex $a_i$ has coordinates
$\langle x_i, y_i \rangle$. The area $B(\mathcal{A})$ is

$$B(\mathcal{A}) = \frac{1}{2} \sum_{i=1}^{v} (y_i + y_{i+1})(x_i - x_{i+1}) \qquad (4)$$

In a nutshell, given the size $n = |\mathcal{P}|$ of the landmark configuration,
we look for the places $p_1, \ldots, p_n$ that maximize the working visibility
area (see Eq. (3)). We use a simulated annealing algorithm to achieve this.
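
The polygon area formula of Eq. (4) is short enough to state as code. The sketch below is a direct transcription. The function name is an assumption, and the wrap-around index stands in for the convention that the vertex after the last one is the first.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its vertices, as in Eq. (4).

    `vertices` is a list of (x, y) pairs in counterclockwise order.
    A clockwise ordering yields the same magnitude with a negative sign.
    """
    v = len(vertices)
    total = 0.0
    for i in range(v):
        x_i, y_i = vertices[i]
        x_next, y_next = vertices[(i + 1) % v]   # closes the polygon
        total += (y_i + y_next) * (x_i - x_next)
    return 0.5 * total
```

For example, the unit square listed counterclockwise gives an area of 1.0, and the right triangle with legs 4 and 3 gives 6.0.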



2.2 Simulated Annealing 

Kirkpatrick et al. [5] noted that “There is a deep and useful connection between
statistical mechanics and multivariate or combinatorial optimization.”
Statistical mechanics deals with the problem of interpreting the properties of
large aggregations of atoms. One of its fundamental questions is what happens to
matter when the system approaches its low-temperature limit. When the
temperature is high, the atoms present high mobility. As the temperature cools
off, the atoms find arrangements where they still keep a certain kinetic energy.
An appropriate annealing scheme will cause the aggregation of atoms to align in
structures of low energy. A certain configuration can be weighted by its
Boltzmann probability factor, $\exp(-E/kT)$, where $E$ is the energy at
temperature $T$ and $k$ is Boltzmann’s constant, which relates temperature with
energy. Metropolis et al. [7] suggested the following annealing procedure. The
particles move at random within a hypersphere of known radius centered at the
current particle position. The change in the energy $\Delta E$ of the system is
calculated. If $\Delta E < 0$, then the system is in a state of lower energy and
the configuration is accepted. If $\Delta E > 0$ then the new configuration is
accepted with probability $\exp(-\Delta E/kT)$. After a certain number $\tau_c$
of changes have taken place or a certain number of trials are done, the
temperature is reduced by a factor $\Delta T$. There is a finite number of
temperature changes.

In our case, the energy $E$ and constant $k$ are factors related to the size of
the working visibility area of a certain configuration $\mathcal{P}$. The initial
temperature is set to one, $T \leftarrow 1$. At each temperature step the
temperature is reduced by a multiplicative factor $\Delta T = 0.97$, as
$T \leftarrow T \Delta T$. There is a maximum of 100 temperature steps. An
initial configuration $\mathcal{P} = \{p_1, \ldots, p_n\}$ is generated at random
such that $p_i \neq p_j$ for $i \neq j$. The Metropolis loop for the landmark
placing problem runs as follows. A binary vector $C = \{c_1, \ldots, c_n\}$ of
changes is computed at random. If $c_i = 1$, it means that another location,
selected at random, will be tried for the landmark currently at $p_i$. Otherwise
the landmark stays where it is. A new configuration is computed at random taking
into account the values in vector $C$ and making sure that $p_i \neq p_j$ for
$i \neq j$. The current configuration $\mathcal{P}$ is evaluated using Eq. (3),
thus obtaining the current objective function value $a_i$. If there is an
improvement in the objective function, i.e., $a_i > a_{i-1}$, then the new
configuration is accepted. If there is no improvement then the new configuration
is accepted with probability $\exp(-\Delta E/kT)$. There are a number of trials
at any temperature step. The Metropolis scheme suggests passing to the next
temperature step, i.e., reducing the temperature, when a maximum number of
combinations is reached (currently $100 \times n$) or when a number of successful
changes $\tau_c$ on the configuration have been applied (currently
$10 \times n$). The algorithm passes to the next temperature step and the cycle
is repeated until a certain maximum number of temperature steps are made.
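
The loop described above can be sketched on a toy problem. The schedule (initial temperature 1, factor 0.97, 100 temperature steps, at most 100 x n trials and 10 x n accepted changes per step) follows the text; everything else is an assumption made so that the example is self-contained. In particular, the workspace is replaced by a one-dimensional row of `n_sites` candidate positions, the objective `coverage` is a hypothetical stand-in for Eq. (3), each entry of the change vector is drawn with probability 0.5, and the energy change is taken as the loss in coverage with k = 1.

```python
import math
import random

def coverage(config, width, n_sites):
    """Toy objective: number of sites within `width` of at least one landmark."""
    covered = set()
    for p in config:
        covered.update(range(max(0, p - width), min(n_sites, p + width + 1)))
    return len(covered)

def anneal(n_landmarks, n_sites, width, k=1.0, seed=0):
    """Simulated annealing over landmark positions; requires n_sites > n_landmarks."""
    rng = random.Random(seed)
    config = rng.sample(range(n_sites), n_landmarks)     # distinct positions
    value = coverage(config, width, n_sites)
    best, best_value = list(config), value
    T = 1.0                                              # initial temperature
    for _ in range(100):                                 # temperature steps
        trials = successes = 0
        while trials < 100 * n_landmarks and successes < 10 * n_landmarks:
            trials += 1
            changes = [rng.random() < 0.5 for _ in config]   # binary vector C
            candidate = list(config)
            for i, change in enumerate(changes):
                if change:
                    # Move landmark i to a site not used by any other landmark.
                    free = [s for s in range(n_sites) if s not in candidate]
                    candidate[i] = rng.choice(free)
            new_value = coverage(candidate, width, n_sites)
            delta = new_value - value                    # gain in the objective
            # Accept improvements; otherwise accept with Boltzmann probability.
            if delta > 0 or rng.random() < math.exp(delta / (k * T)):
                config, value = candidate, new_value
                successes += 1
                if value > best_value:
                    best, best_value = list(config), value
        T *= 0.97                                        # cool down
    return best, best_value
```

Keeping track of the best configuration seen is a common safeguard that the text does not mention. It costs nothing and protects against a late uphill move being returned as the result.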

3 Experimental Results 

In Fig 2, we sketch the output of our placing algorithm with a simulation where
a polygonal workspace $W$ that includes an obstacle $\mathcal{B}$ is used. The
working visibility region $V(p_i)$ is computed when the algorithm selects to
place a landmark on position $p_i$.

Our placing algorithm can be used in conjunction with path planning algorithms
that assume the existence of landmarks by providing the regions within
the workspace from where sensing is possible. For instance, Lazanas and
Latombe [6] proposed a motion planning algorithm of polynomial time complexity.
The idea is that with the use of artificial landmarks one can bound the
uncertainty in the robot’s position. We built an emulation of an office-like
environment with cardboard walls; the cardboard’s height is approximately 4 feet.
Fig 3(a) shows the layout of the workspace. It also signals the approximate
position and orientation of the 18 landmarks placed. In Fig 3(c), we show a
picture of the robot when it is entering a room. The landmarks are used to signal
special places in the environment. For instance, we mark people’s desks, the
fridge, the coffee maker machine and so on. Our work and Lazanas and Latombe’s
work are complementary. Our algorithm provides a certain landmark configuration
while Lazanas and Latombe’s algorithm provides a path to go from the current to a
goal position provided that the path exists.

In a third experiment, we run the landmark placing program in a portion of a
map of our lab. Fig 4(a) shows the layout of a portion of the robot workspace. The
central rectangles correspond to columns of the building. We run our algorithm
for landmark placing. Fig 4(b) shows the result when the number of landmarks
is 10. The algorithm reports that 51% of the area has been covered and that
on the average each visibility region covers 96% of what it can cover, i.e., they
are well spread. The landmark configuration is distributed mainly in the
open area. Fig 4(c) shows the result when the number of landmarks is 15. The
algorithm reports that 69.8% of the area has been covered and that on the
average each visibility region covers 87.6% of what it can cover. At this point
one of the offices and the corridor start being covered.



Conclusion 



One of the main problems in visual perception for mobile robots is the efficient
use of limited computing resources. Some arguments against the use of artificial
landmarks involve the high cost of engineering the environment and the
disruption caused by the introduction of symbols in an otherwise harmonious
decoration. However, we must point out that, with few exceptions [4], robots’
workspaces are designed primarily with humans in mind. In this case, human
environments convey a lot of cultural information [1] that people use as
landmarks to move and localize themselves, and it is still not obvious
how this information will be given to the robots.

This paper presents an algorithm for computing the places where landmarks
should be put to increase their joint coverage of the workspace. However,
depending on the application, one may need to increase the number of terms in
Eq. (3) to obtain more accurate results, although at the expense of increasing
computing time. In this document, we assumed a particular geometry for the
visibility region $\mathcal{U}$. However, the shape of $\mathcal{U}$ changes
depending on the type of parameter that one needs to recover. An observation
model for each type of parameter that is going to be recovered can be
incorporated into the computation of the landmark placement. This model can take
into account a particular shape and a probabilistic model for each type of
parameter and each landmark.

One may say that the use of artificial landmarks will become obsolete with
the arrival of better computing technology. However, advances in computing
technology must be accompanied by effective algorithmic development, or
otherwise they will be fruitless. Furthermore, we believe that there will always
be a place for the study of symbols in the context of autonomous systems. A clear
example: although humans are fairly sophisticated autonomous entities, symbol
recognition plays an important role in our everyday lives, as for instance when
we recognize traffic symbols on the streets.
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Abstract. This paper describes a formal framework for the analysis of 
genetic algorithms. The model is based on the idea that over the space of 
populations an equivalence relation can be defined, as well as a metric on 
the space of equivalence classes induced by this relation. With these tools
it can be proved that selection in a GA causes some kind of convergence
and entropy reduction. The model is not restricted to a particular kind
of selection.

Keywords: genetic algorithms, Hamming distance, partition, poset, 
metric space, convergence, entropy. 



1 Introduction 

Since their introduction by Holland in the 1960’s, genetic algorithms (GA’s) have
motivated several efforts to develop a formal model in which they can be
explained. Holland himself took the first step in his book ([7]), which provides
the well-known schema theorem. This theorem has recently been the subject of much
critical discussion, however, because it is based only on the destructive effects
of the crossover and mutation operators. Also, the framework provided by Holland
is restricted to a particular kind of GA with proportional selection, 1-point
crossover and uniform mutation.

Other approaches have been proposed such as those based on statistical me- 
chanics ([13,16]), and the very successful approach based on Markov chain anal- 
ysis ([12,3,4], [15]), and its derivations related with dynamical systems ([17,18]). 

In this paper a different framework is proposed for the analysis of GA’s.
As a first step towards a comprehensive approach, a GA with selection only is
analyzed. The model presented here is more general than others because no
particular kind of selection is assumed; the operator is described as generally
as possible.

The theoretical framework is based on a partition of the populations set and 
the definition of a metric in such space. A partial order is defined on the set of 
partitions in order to prove some kind of convergence, if the GA operates only 
by iterated application of selection. 

The convergence caused by selection has been analyzed in the past [10]. In
this work a new approach is introduced, and it is established that selection
causes the population of a GA to tend to a subset of populations characterized by
the same set of genotypes. Furthermore, the presence of the global maximum in
that set of genotypes depends on the kind of selection rather than on selection
itself. As mentioned above, the analysis does not consider a particular kind of
selection.

With the tools provided by the framework described, the general behavior of
entropy is analyzed, and it is shown that, in general, selection forces lead to a
decrease of entropy. An information theory approach has been presented before,
as in [11], but in a different way.

In the second section some concepts and notation that will be used in the rest
of the paper are defined. In the third section the fundamental tools are
developed, based on metric spaces and the partition of the populations set. In
the fourth section the analysis of selection is done by means of tools related
to metric spaces. In section five the behavior of entropy is analyzed, and
finally some conclusions and pointers to future research are presented.

2 Basic Definitions 

The genetic algorithms considered in this paper have individuals that are binary
strings, or equivalently integers encoded in binary, of length $\ell$. Every
individual in a population is in $\mathbb{B}_\ell = \{0, 1, \ldots, 2^\ell - 1\}$.

A genetic algorithm operates over populations with a finite number of those 
strings. 

Definition 1. The number of binary strings contained in a population is the 
population size. 

Note that the previous definition does not consider the number of different 
strings, only the number of strings, repeated or not, in the population. In order 
to count the number of different structures contained in the population the 
following definition is used: 

Definition 2. The number of different binary strings contained at least once
in a population $P$ is the number of genotypes or the cardinality of the
population, denoted by $C(P)$. Note that $C(P) \le 2^\ell$.

In the following $N$ denotes the size of a finite population of binary strings.
The set of all possible populations of size $N$ of binary strings of length
$\ell$ is denoted by $\mathbb{P}_N(\mathbb{B}_\ell)$ or in short
$\mathbb{P}_{N,\ell}$.

Any population with a finite number of individuals can be described by means 
of the proportion of every binary string in the population. 

Definition 3. Let $f_i$ be the number of times (frequency) that
$i \in \mathbb{B}_\ell$ appears in a population $P \in \mathbb{P}_{N,\ell}$. $P$
is described by a vector $\mathbf{P} = (p_0, p_1, \ldots, p_{2^\ell - 1})$
where $p_i = f_i/N$. $\mathbf{P}$ is the proportions vector.

The degree of variation in a population is then described by means of the
following:

Definition 4. The variation vector of a population is
$V(P) = (\delta_0, \ldots, \delta_{2^\ell - 1})$ where:

$$\delta_i = \begin{cases} 1 & \text{if } p_i > 0 \\ 0 & \text{if } p_i = 0 \end{cases}$$

The variation vector has an entry $i$ equal to 1 if and only if the genotype $i$
is represented by at least one instance in the population. Note that the set of
all the variation vectors is equivalent to the set of integers
$\{0, 1, \ldots, 2^{2^\ell} - 1\}$ encoded in binary.

Definition 5. The variation coefficient of the population $P$, denoted by $v(P)$,
is the number of ones in $V(P)$. Note that this is the number of different
genotypes represented in $P$ by at least one string, that is, $C(P)$.

Definition 6. Let $P$ and $Q$ be two populations in $\mathbb{P}_{N,\ell}$. The
separation coefficient between $P$ and $Q$, denoted by $H(P, Q)$, is the number
of positions where $V(P)$ and $V(Q)$ have different values.
$0 \le H(P, Q) \le 2^\ell$.

Note that $H(P, Q)$ is the traditional Hamming distance between $V(P)$ and
$V(Q)$, as defined in [6]. In what follows $n = 2^\ell$.
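
Definitions 3 to 6 can be checked on small examples with a few lines of code. In this sketch a population is simply a list of integers in {0, ..., 2^ℓ − 1}; the function names are hypothetical and chosen to mirror the definitions.

```python
from collections import Counter

def proportions_vector(population, length):
    """Definition 3: p_i = f_i / N for every genotype i."""
    counts = Counter(population)
    n = len(population)
    return [counts[i] / n for i in range(2 ** length)]

def variation_vector(population, length):
    """Definition 4: entry i is 1 if genotype i occurs at least once, else 0."""
    present = set(population)
    return [1 if i in present else 0 for i in range(2 ** length)]

def variation_coefficient(population, length):
    """Definition 5: number of ones in the variation vector."""
    return sum(variation_vector(population, length))

def separation(pop_p, pop_q, length):
    """Definition 6: Hamming distance between the two variation vectors."""
    vp = variation_vector(pop_p, length)
    vq = variation_vector(pop_q, length)
    return sum(a != b for a, b in zip(vp, vq))
```

With strings of length 2, the populations [0, 0, 3, 1] and [1, 3, 3, 0] have different proportions vectors but the same variation vector (1, 1, 0, 1), so their separation coefficient is 0.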

3 The Populations Set Partition 

We now need to define a partition of the space $\mathbb{P}_{N,\ell}$. To do so,
we define an equivalence relation, as in [14] (chapter 2), for example.

Theorem 1. $H(P, Q) = 0$ defines an equivalence relation in $\mathbb{P}_{N,\ell}$.

Proof: $P$ is related to $Q$ (denoted by $P \sim Q$) if and only if
$H(P, Q) = 0$. We need to prove that the relation is reflexive, symmetric and
transitive:

— $P \sim P$, because the number of positions where $V(P)$ differs from $V(P)$
is zero, so $H(P, P) = 0$.

— If $P \sim Q$ then the number of positions where $V(P)$ differs from $V(Q)$ is
zero, so $H(P, Q) = H(Q, P) = 0$ and hence $Q \sim P$.

— If $P \sim Q$ and $Q \sim R$ then $H(P, Q) = H(Q, R) = 0$: the number of
positions where $V(P)$ is different from $V(Q)$ and the number of positions where
$V(Q)$ and $V(R)$ are different are both zero, so $V(P) = V(Q) = V(R)$, hence
$H(P, R) = 0$, that is, $P \sim R$.



□ 

The relation defined above induces a partition of the space where it is defined,
$\mathcal{P}_{N,\ell}$. Every equivalence class is identified by means of the
variation vector of any population in the class, given that such variation vector
is the same for every population in that class. Definition 4 can then be extended
to the set of classes, so that one can talk about the variation vector of a class
rather than the variation vector of a particular population only. In what follows
$\mathcal{C}_\ell$ denotes the set of all the population classes of length $\ell$
strings.
Definition 7. Let $i_2$ be the nonnegative integer $i$ encoded in binary. The class
$C_i \in \mathcal{C}_\ell$ is:

$$C_i = \{P \in \mathcal{P}_{N,\ell} \mid V(P) = (b_0, b_1, \ldots, b_{n-1}) \text{ with } i_2 = b_0 b_1 \ldots b_{n-1}\}$$

And

$$V(C_i) = (b_0, b_1, \ldots, b_{n-1})$$

is the variation vector of the class $C_i$.
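The correspondence between a class and its integer label can be illustrated with a short sketch. It assumes that $b_0$ is the most significant bit of $i_2$ (the text writes $i_2 = b_0 b_1 \ldots b_{n-1}$ but does not state the bit order explicitly); the function names are hypothetical.

```python
def class_index(variation_vector):
    """Read b_0 b_1 ... b_{n-1} as a binary number, b_0 most significant."""
    index = 0
    for bit in variation_vector:
        index = 2 * index + bit
    return index


def class_vector(index, n):
    """Inverse mapping: the variation vector of the class C_index."""
    return [(index >> (n - 1 - j)) & 1 for j in range(n)]


# For strings of length 2 there are n = 4 genotypes.
print(class_index([1, 1, 0, 1]))  # 13, since 1101 in binary is 13
print(class_vector(13, 4))        # [1, 1, 0, 1]
```

Every population whose variation vector is (1, 1, 0, 1) therefore belongs to the same class, whatever the proportions of its three genotypes.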

Definition 8. The number of ones in the variation vector of the class
$C_i \in \mathcal{C}_\ell$ is the variation coefficient of $C_i$, denoted by $v(C_i)$.

So, two populations $P$ and $Q$ are in the same equivalence class if and only
if they have the same genotypes. $C_i$ denotes the class of all the populations
whose variation vector corresponds to the integer $i$ encoded in binary. $H$ is
now redefined for classes.

Definition 9. $H(C_i, C_j)$ is the number of positions where $V(C_i)$ and $V(C_j)$
have different values.

Theorem 2. $H$ is a metric for $\mathcal{C}_\ell$.

Proof: Let $C_i$ and $C_j$ be two classes of populations. $H$ is the Hamming
distance between the two class variation vectors $V(C_i)$ and $V(C_j)$, and the
Hamming distance is a true metric as proved in [1]; then $H$ is a metric for
$\mathcal{C}_\ell$. □

A Cauchy sequence in the metric space $(\mathcal{C}_\ell, H)$ is now defined, as
appears in [14] and [5] (chapter 3).

Definition 10. A sequence $\{C^k\}$ in $\mathcal{C}_\ell$ is a Cauchy sequence if and
only if for any real number $\varepsilon > 0$ there exists a nonnegative integer $N$
such that for every $k, m > N$:

$$H(C^k, C^m) < \varepsilon$$

The previous definition provides the elements needed to demonstrate the
following theorem. The definition of complete metric space can be found in [14]
and [5] (chapter 3).

Theorem 3. $(\mathcal{C}_\ell, H)$ is a complete and bounded metric space.

Proof: To demonstrate completeness it must be shown that any Cauchy sequence in
$\mathcal{C}_\ell$ converges.

Let $\{C^k\}$ be a Cauchy sequence; then for any $\varepsilon > 0$ there exists an
integer $N$ such that for every $k, m > N$:

$$H(C^k, C^m) < \varepsilon$$

In particular, for $\varepsilon = 1$ there exists $N$ such that for every $k, m > N$:

$$H(C^k, C^m) < 1$$

This implies that $H(C^k, C^m) = 0$, because this is the only possible value of $H$
less than 1; then $C^k$ and $C^m$ are the same string. And this string is in
$\mathcal{C}_\ell$.

$H$ can only take values in $\{0, 1, \ldots, 2^\ell\}$, so it is a bounded metric. □

4 Convergence of Generic Selection 

This section shows that a genetic algorithm which operates only by selection
(i.e. the probabilities of mutation and crossover are both zero) converges. A
similar result has been demonstrated before [10] with the use of other techniques.

The algorithm converges to a class of populations rather than to a particular
population. Every generation of a GA belongs to a class; looking at the evolution
of the algorithm from the point of view of the classes, it is possible to see
that such evolution points to some particular class.

The following defines a partial order relation ([9, 8]) in $\mathcal{C}_\ell$.

Definition 11. Let $C_i$ and $C_j$ be two classes in $\mathcal{C}_\ell$. $C_i$ is a
reduction of $C_j$ ($C_i \preceq C_j$) if and only if in every position where
$V(C_i)$ has 1, $V(C_j)$ also has 1. That is, any class $C_i$ is a reduction of
another class $C_j$ if and only if all the genotypes represented in $C_i$ are also
represented in $C_j$.

With the previous definition $\mathcal{C}_\ell$ constitutes a partially ordered set
(poset). In fact the set is a lattice (any two classes are reductions of some
other class, and a class can be found that is a reduction of both classes); the
supremum element is $(1, \ldots, 1)$ and the infimum element is $(0, \ldots, 0)$. An
alternative lattice definition can be found in [9] (ch. 1): a poset
$(L, \preceq)$ is a lattice if and only if the supremum and infimum exist for any
finite nonvoid subset $S \subseteq L$.

Using the previously demonstrated theorem and the lattice definition mentioned
above, it is possible to prove the following theorem.

Theorem 4. Let $\{C^k\}$ be an infinite sequence of elements in $\mathcal{C}_\ell$
such that

$$C^i \preceq C^{i-1}$$

(monotonically decreasing); then $\{C^k\}$ converges to some class
$I \in \mathcal{C}_\ell$ where

$$I = \inf\{C^k\}$$

Proof: $\mathcal{C}_\ell$ has exactly $2^n$ elements, so $\{C^k\}$ has a finite number
of different terms $R \subseteq \mathcal{C}_\ell$. By the lattice definition $R$ has
an infimum element $I$ such that:

$$I \preceq C \quad \forall C \in R \qquad (1)$$

Let $M$ be the smallest integer such that $C^M = I$. Given that $\{C^k\}$ is
monotonically decreasing, $C^k = I$ for every $k > M$. So for every
$\varepsilon > 0$ there exists an $N = M$ such that for every $k, r > N$:

$$H(C^k, C^r) = 0 < \varepsilon$$

So $\{C^k\}$ is a Cauchy sequence and converges by theorem 3.

Also, for every $\varepsilon > 0$, if $r > M$ then

$$H(C^r, I) = 0 < \varepsilon$$

so $\{C^k\}$ converges to $I$. □



The meaning of selection can be formulated in general terms as follows. 

Definition 12. Let $P_t \in \mathcal{P}_{N,\ell}$ be a population in a class
$C_k \in \mathcal{C}_\ell$. Selection is a function
$S : \mathcal{P}_{N,\ell} \to \mathcal{P}_{N,\ell}$ such that:

$$S(P_t) = P_{t+1}$$

where $P_{t+1} \in C_r$ and $C_r \preceq C_k$.

That is, selection cannot increase the number of genotypes represented in any
population. The result of applying selection to any population is another
population with, at most, the same set of genotypes.

Definition 13. Let $\Gamma : \mathcal{P}_{N,\ell} \to \mathcal{C}_\ell$ be a function
that assigns to every population $P$ the class where $P$ lives. That is, if
$P \in C_i$ then $\Gamma(P) = C_i$.

The iterated application of selection to any population induces a monotonically
decreasing sequence (as defined in [14], chapter 3) of the classes to which such
populations belong. Given that the set of classes with $H$ is a complete metric
space, such a sequence converges in $\mathcal{C}_\ell$. $S^k(P)$ denotes the $k$-th
iterated application of selection to the population $P$. The set of tools needed
to prove the following theorem, analogous to the respective theorem for
$\mathbb{R}$ (theo. 3.14 [14]), is now complete:

Theorem 5. Let $P_0 \in \mathcal{P}_{N,\ell}$ be an initial population. The sequence:

$$\cdots \preceq \Gamma(S^3(P_0)) \preceq \Gamma(S^2(P_0)) \preceq \Gamma(S(P_0)) \preceq \Gamma(P_0)$$

converges to some class $C \in \mathcal{C}_\ell$.

Proof: By theorem 4 this is true. □

5 Entropy Analysis

With the convergence result proved, it is possible to analyze the behavior of
entropy in the generic evolution of populations.

Definition 14. Let $P = (p_0, \ldots, p_{n-1})$ be a population in
$\mathcal{P}_{N,\ell}$. The entropy of $P$, denoted by $\mathcal{H}(P)$, is:

$$\mathcal{H}(P) = -\sum_{i=0}^{n-1} p_i \log_2 p_i$$

with the usual convention $0 \log_2 0 = 0$.
The entropy of a population with only one genotype is 0 because such a genotype
($i$) has $p_i = 1$ and $p_j = 0$ for every $j \ne i$.

In a population of size $N$ with $k \le n$ equally probable genotypes, the
probability of each genotype is:

$$p_i = \frac{N/k}{N} = \frac{1}{k}$$

The entropy of a population with this condition is maximum:

$$\mathcal{H}(P) = k \cdot \frac{1}{k} \log_2(k) = \log_2(k) \qquad (2)$$

Then the greatest entropy of any population in $\mathcal{P}_{N,\ell}$ is $\ell$,
because the total number of possible genotypes is $2^\ell$. In summary:

$$0 \le \mathcal{H}(P) \le \ell \quad \forall P \in \mathcal{P}_{N,\ell}$$



Theorem 6. Let $P \in \mathcal{P}_{N,\ell}$ be a population whose variation
coefficient is $v(P)$; then:

$$\mathcal{H}(P) \le \log_2(v(P))$$

Proof: The entropy is maximum when all the genotypes are equally probable;
applying equation (2) with $k = v(P)$, the desired result is obtained. □
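The bound of theorem 6 can be checked numerically. The sketch below assumes the usual Shannon entropy in base 2 for a population given as a list of binary strings; the function names and the sample populations are hypothetical.

```python
import math
from collections import Counter


def entropy(population):
    """Shannon entropy (base 2) of the genotype proportions of a population."""
    size = len(population)
    proportions = [count / size for count in Counter(population).values()]
    # Genotypes that are absent contribute nothing (0 log 0 = 0 convention).
    return sum(p * math.log2(1 / p) for p in proportions)


def entropy_bound(population):
    """log2 of the variation coefficient v(P), the bound of theorem 6."""
    return math.log2(len(set(population)))


single = ["00", "00", "00", "00"]    # one genotype
uniform = ["00", "01", "10", "11"]   # k = 4 equally probable genotypes
skewed = ["00", "00", "01", "11"]    # k = 3, unequal proportions

print(entropy(single))                          # 0.0
print(entropy(uniform))                         # 2.0
print(entropy(skewed), entropy_bound(skewed))   # entropy below log2(3)
```

The uniform population attains the bound, while the skewed one stays strictly below it.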

It is useful to extend the definition of population entropy to define the entropy
of a class in $\mathcal{C}_\ell$.

Definition 15. Let $C \in \mathcal{C}_\ell$ be a class of populations. The entropy of
$C$ is defined as:

$$\mathcal{H}(C) = \max\{\mathcal{H}(P) \mid P \in C\}$$

With this definition and theorem 6 the following corollary can be formulated:

Corollary 1. $\mathcal{H}(C) = \log_2(v(C))$

Proof:

$$\mathcal{H}(C) = \max\{\mathcal{H}(P) \mid P \in C\}$$

Every population $P \in C$ has the same variation coefficient. Then, by theorem 6:

$$\mathcal{H}(C) = \log_2(v(P)) \quad \forall P \in C$$

and the variation coefficient of every population in $C$ is the variation
coefficient of $C$ itself. □
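Corollary 2. Let $P, Q \in \mathcal{P}_{N,\ell}$ be two populations such that:

$$S(P) = Q$$

then, with $\Gamma$ the function that maps a population to its class (def. 13):

$$\mathcal{H}(\Gamma(Q)) \le \mathcal{H}(\Gamma(P))$$

Proof: Since (def. 12):

$$\Gamma(Q) \preceq \Gamma(P)$$

the number of ones in $V(Q)$ is less than or equal to the number of ones in
$V(P)$ and:

$$\log_2(v(\Gamma(Q))) \le \log_2(v(\Gamma(P)))$$

By the previous corollary:

$$\mathcal{H}(\Gamma(Q)) = \log_2(v(\Gamma(Q)))$$

and

$$\mathcal{H}(\Gamma(P)) = \log_2(v(\Gamma(P)))$$

So:

$$\mathcal{H}(\Gamma(Q)) \le \mathcal{H}(\Gamma(P))$$

□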



Now the following theorem can be formulated:

Theorem 7. Let $P_0 \in \mathcal{P}_{N,\ell}$ be an initial population. The sequence
of class entropies:

$$\{\mathcal{H}(\Gamma(S^k(P_0)))\}$$

is monotonically decreasing, bounded and convergent.

Proof: That the sequence is monotonically decreasing follows from corollary 2.
The lower bound is 0 and the upper bound is the entropy of the class to which the
initial population belongs. This sequence is in $\mathbb{R}$, which is a complete
metric space, thus it converges to some real number. □
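The behavior described by theorems 5 and 7 can be observed in a small simulation. This is a sketch under stated assumptions: fitness-proportional (roulette wheel) selection is used as one example of a selection operator, and the fitness function, population size and random seed are arbitrary choices made for illustration.

```python
import math
import random


def proportional_selection(population, fitness, rng):
    """Sample N individuals with replacement, proportionally to fitness."""
    weights = [fitness(individual) for individual in population]
    return rng.choices(population, weights=weights, k=len(population))


def class_entropy(population):
    """Entropy of the class of the population: log2 of its variation coefficient."""
    return math.log2(len(set(population)))


rng = random.Random(0)                       # fixed seed for repeatability
fitness = lambda s: 1 + s.count("1")         # assumed, strictly positive fitness
population = ["".join(rng.choice("01") for _ in range(4)) for _ in range(20)]

history = []
for _ in range(30):
    history.append((frozenset(population), class_entropy(population)))
    population = proportional_selection(population, fitness, rng)

for generation, (genotype_set, h) in enumerate(history[:5]):
    print(generation, len(genotype_set), round(h, 3))
```

Because selection only copies individuals already present, the set of genotypes at each generation is contained in the previous one, so the recorded class entropies can never increase.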



6 Conclusions and Future Research 

The framework shown here is a new and useful approach for the analysis of
GAs. With the tools provided by such a framework, some features of selection
have been proved. Also, the analysis is more general than those reached by the
use of some other methods.

The essential feature of selection has been isolated, the feature that is present
in all the different kinds of this operator: selection cannot increase the number
of different genotypes present in a population. This is the sufficient condition
that assures the convergence of a GA with selection only.

The behavior of entropy has also been analyzed along the iterated application
of selection to an initial population. It can be concluded that, from a global
point of view, selection reduces entropy. The entropy of a population can increase
from one generation to the next but, in the long term, it is bounded by the
entropy of its class, which decreases and converges to a limit that is closer to
zero the smaller the number of different genotypes present in the populations.

To demonstrate the usefulness of this framework, it remains to apply it to
the other common genetic operators: mutation and crossover. Currently only
some intuitive features have been proved. It is possible to speculate, however,
that if this method is applied to mutation, for example, the result will be that,
in the limit, the sequence of classes to which the generations belong tends to
stay in a neighborhood of some other class. The radius of such a neighborhood
will grow with the probability of mutation and will also be affected by the
distribution of genotypes present in the population.

It could be interesting to consider the mutation operator as noise in a channel. 
A population P before mutation will be transmitted through a noisy channel to 
another population P' and some of the symbols present in P will be modified to 
others that were originally not present. 

The application of the method to the analysis of some particular kinds of
selection, as presented in [2], is also of interest.
Abstract. A new method to automatically identify a human face in
a 2D gray-level image is presented. The method uses an invariant
description of the face and a genetic algorithm to accomplish the task. The
features used are the first four translation, rotation and scale moment
invariants proposed by Hu [1]. In a first step, an image possibly
containing a face is divided into small cells of fixed size of 5 x 5 pixels.
For each cell, the ordinary moments are then computed. From these, the
corresponding Hu invariants are derived. Human face identification
is thus accomplished by grouping individual cells using a genetic
algorithm that fits a specific cost function. This cost function
corresponds to the invariant description of a human face given in terms of the
detected image features.



1 Introduction 

The human face is a complex pattern. Finding human faces automatically in
a scene is a difficult yet significant problem. In recent years this problem has
attracted considerable attention [2], [3], [4], [5], [6], [7] and [8], although it
still remains an open problem [9] and [10]. It is the first important step in a
fully automatic human recognition system [4]. The solution to this problem is
important in many applications such as videoconferencing, multimedia and
internet video communication. It would thus be desirable to have a methodology
that allows the location of a face in an automatic way.

The technique proposed here consists of performing segmentation and recognition
at the same time, as follows. Given an input image possibly containing a face: 1)
divide this image into small cells of 5 x 5 pixels, 2) compute the ordinary
moments (those used in mechanics) for each cell, 3) from these moments, derive
the corresponding Hu invariants [1], and 4) accomplish human face detection
by grouping individual cells with a genetic algorithm that fits a specific
cost function. This cost function corresponds to the invariant description of a
human face in terms of the detected image features.

In solving this problem, the following four assumptions are made:

1. The number of human faces in the image is unknown.

2. Their size is also unknown.

3. Their location and orientation are unknown too.

4. Small pan and tilt rotations and changes of facial expression are also
allowed.

The remainder of the paper is organized as follows. In section 2 each of the
steps composing the proposed methodology is described. In section 3 the
features used to describe a face are presented. The guided searching algorithm
and the technique to sweep the space of solutions are given in sections 4 and 5,
respectively. Some experimental results testing the performance of the proposed
methodology are presented in section 6. Some conclusions and directions for
future research are finally given in section 7.

2 The Methodology

The proposed methodology to automatically identify a human face in a 2D image
has two main steps: human face description and human face identification. Each
of these two steps is described next.

2.1 Human Face Description

The human face can be considered as a pattern. As such, it can be described as
a vector in terms of a set of invariant features as follows:



$$O_i = [\psi_1 \; \psi_2 \; \cdots \; \psi_K] \qquad (1)$$

with

$$\psi_k = f_k(O_i), \quad 1 \le i \le n, \quad 1 \le k \le K \qquad (2)$$

where

$O_i$ is face number $i$,
$\psi_k$ is feature number $k$,
$f_k$ is the generator function of feature number $k$,
$n$ is the number of faces,
$K$ is the number of features.



To obtain $O_i$ for each person, a CCD camera is placed in front of the subject,
trying to keep him/her always within the visual field of the sensor. In each
case, the corresponding face is then manually segmented. For each segmented
region the $f_k$ are computed (see section 3). A feature vector is finally obtained
in terms of the describing features. A face is thus represented by a unique
feature vector, $O_i$, $i = 1, \ldots, n$.

2.2 Human Face Identification 

Using the feature vectors obtained in the last section as references, the
identification process can be started. It consists of two steps as follows:

1. Confidence Function Definition. Define a function involving $f_k$, $C(f_k)$, to
guide the search for a face (its description) in the representation
space $E$. For a given image, $E$ is composed as the union of all possible values
of $f_k$, that is:

$$E = \bigcup_{r=1}^{R} f_k^{r}, \quad k = 1, 2, \ldots, K \qquad (3)$$

where $R$ is the total number of disjoint pixel regions in the image.

2. Searching process. Sweep the representation space $E$ by means of a guided
searching process until the best approximation of $f_k$ is obtained by merging
and splitting fixed-size regions in the image. This will be done by a
genetic algorithm as explained in section 5. The final result will be a region
enclosing the object of interest (a face in this case).

3 Invariant Descriptors Used

Many features have been used in the past to derive the model of an object.
The features actually used in this work are those proposed by Hu [1]. These
features have been chosen because they have proven to be effective in
shape recognition [1], [11], [12], and [13]. Only the first four are used.
According to the methodology proposed in section 2, a face can thus be described
as follows:

$$O_i = [\phi_1 \; \phi_2 \; \phi_3 \; \phi_4] \qquad (4)$$

To be used in the context of this work, these invariant features must be
described in terms of the ordinary moments and not in terms of the central
moments. For example, Hu's first invariant can be expressed as:

$$\phi_1 = \frac{(m_{20} + m_{02}) \cdot m_{00} - (m_{10}^2 + m_{01}^2)}{m_{00}^3} \qquad (5)$$



The reason for this is that the ordinary moments can be directly added,
because they are all referred to the same point (the origin of the coordinate
frame), whereas the central moments are referred to different points (the
centroids of each cell). If the central moments of each cell were used to obtain
the invariant moments of bigger regions, it would be necessary to first recompute
the centroid of each merged region and then obtain the invariant moments with
respect to these new centroids.

Thus, if a region $R$ of interest is divided into $n$ subregions, $R_i$,
$i = 1, \ldots, n$, the standard moments of $R$ can be obtained as the summation of
the standard moments of each subregion as:

$$m_{pq}(R) = \sum_{i=1}^{n} m_{pq}(R_i) \qquad (6)$$

For each subregion, the standard moments are computed by means of the
following expression [1]:

$$m_{pq}(R_i) = \sum_{(x,y) \in R_i} x^p y^q f(x, y) \qquad (7)$$

Let us suppose now that the entire image is divided into cells of the same
size by using a fixed reticule as shown in figure 1 (a), and that for each cell
the standard moments are computed using equation (7). An object in the image,
when the reticule is applied, can thus be seen as subdivided into regions as
shown in figure 1 (b). In this way, the same object at different sizes will
include more or fewer cells.

The identification (segmentation) problem can thus be seen as an optimiza- 
tion problem: find the set of contiguous cells best fitting the value of a 
descriptive function. In our case, the descriptive function is a feature vector 
whose components are the first four Hu’s invariants as shown in equation 4. 
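The additivity of the ordinary moments and the expression of the first invariant in terms of them can be illustrated with a short sketch. A region is assumed to be a dictionary mapping pixel coordinates to gray levels; the synthetic gray values and all function names are hypothetical, and only the first invariant is covered.

```python
def raw_moments(pixels):
    """Ordinary moments m_pq up to order 2 of a region {(x, y): gray level}."""
    orders = [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2)]
    return {
        (p, q): sum((x ** p) * (y ** q) * g for (x, y), g in pixels.items())
        for (p, q) in orders
    }


def add_moments(a, b):
    """Moments of the union of two disjoint regions: term-by-term sum."""
    return {order: a[order] + b[order] for order in a}


def phi1(m):
    """First Hu invariant written with ordinary moments only."""
    m00 = m[(0, 0)]
    return ((m[(2, 0)] + m[(0, 2)]) * m00
            - (m[(1, 0)] ** 2 + m[(0, 1)] ** 2)) / m00 ** 3


def phi1_central(pixels):
    """Reference value: first Hu invariant from normalized central moments."""
    m = raw_moments(pixels)
    m00 = m[(0, 0)]
    xc, yc = m[(1, 0)] / m00, m[(0, 1)] / m00
    mu20 = sum((x - xc) ** 2 * g for (x, y), g in pixels.items())
    mu02 = sum((y - yc) ** 2 * g for (x, y), g in pixels.items())
    return (mu20 + mu02) / m00 ** 2


def make_cell(x0, y0, size=5):
    """A size x size cell with synthetic gray levels in 1..16 (illustrative)."""
    return {(x, y): (3 * x + 5 * y) % 16 + 1
            for x in range(x0, x0 + size) for y in range(y0, y0 + size)}


cell_a, cell_b = make_cell(0, 0), make_cell(5, 0)
region = {**cell_a, **cell_b}                       # two adjacent cells
merged = add_moments(raw_moments(cell_a), raw_moments(cell_b))

print(merged == raw_moments(region))                # True: moments simply add
print(abs(phi1(merged) - phi1_central(region)) < 1e-9)  # True
```

Merging cells therefore only requires adding five numbers per cell, without revisiting the pixels or recomputing centroids.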

4 Guided Searching Algorithm 

Now, if $O_{interest} = [\phi_1 \; \phi_2 \; \phi_3 \; \phi_4]$ corresponds to the
description of a face to be verified in an image, then the problem reduces to
finding the set of cells whose image region invariant descriptors
$[\phi_1 \; \phi_2 \; \phi_3 \; \phi_4]$, obtained by combining the standard
moments of each cell according to equations (5), (6) and (7), best fit the feature
vector describing $O_{interest}$.

Fig. 1. (a) An image divided into cells of the same size by using a fixed reticule.
(b) The object of interest subdivided into cells.



The cell grouping process is done by means of a genetic algorithm, which
allows us to sweep the representation space by using a confidence function
$C(f_k)$. It is described next.



5 Genetic Algorithm 

A genetic algorithm (GA) [14] has been chosen because, as shown in
[15, 16, 17], GAs have proven to be very useful tools in many problems, including
pattern recognition. The main features of the genetic algorithm used [14] are the
following:

— A chromosome equals the grouping of a set of contiguous cells.

— The initial population contains only individual cells.

— New generations are obtained by recombining new and old populations.

— The crossover and mutation probabilities are fixed at 70% and 5%,
respectively.

— The population size is fixed at 100.

— Multiple crossing points are used.

— The fitness function of each chromosome is obtained as:

$$fit = \frac{\phi_i - \phi_d}{\phi_i + \phi_d} \cdot ff$$

where

$fit$ is the fitness of the chromosome,
$\phi_i$ are the values of the invariants of the interest face,
$\phi_d$ are the values of the invariants of the image region specified by a
chromosome,
$ff$ is a shape factor defined as $ff = \max(\ldots)$ in terms of the size of the
major principal axis of the face and the size of the minor principal axis of
the face.

— The confidence or cost function to be minimized is defined as:

$$C(f_k) = \frac{\phi_i - \phi_d}{\phi_i + \phi_d}$$

where $\phi_i$, $\phi_d$ and $f$ are, respectively, the invariants searched for, the
region invariants and the value of the function to be optimized.



The GA starts by generating an initial population containing only individual
cells. Each chromosome thus represents an individual cell. The next population is
obtained by crossing and/or mutating the chromosomes of the initial population.
This population will contain chromosomes corresponding to regions composed
of two or more cells. This process is repeated until the best minimization of the
cost function is obtained.

A complete description of the GA is given in [14].
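A relative-difference cost of the kind used to guide the search can be sketched as follows. This is not the authors' implementation: summing the absolute relative differences over the four invariants is an assumption made here to obtain a single non-negative number, the shape factor is omitted, and the invariant values below are made up for illustration.

```python
def relative_cost(target, candidate):
    """Sum over the invariants of |phi_i - phi_d| / (|phi_i| + |phi_d|)."""
    total = 0.0
    for phi_i, phi_d in zip(target, candidate):
        denominator = abs(phi_i) + abs(phi_d)
        if denominator > 0:          # both zero: the invariants agree exactly
            total += abs(phi_i - phi_d) / denominator
    return total


def best_region(target, regions):
    """Name of the candidate region whose invariants best fit the target."""
    return min(regions, key=lambda name: relative_cost(target, regions[name]))


# Hypothetical invariant vectors (not taken from the experiments).
target = [0.20, 0.010, 0.0020, 0.0005]
regions = {
    "region_a": [0.21, 0.011, 0.0019, 0.0005],
    "region_b": [0.45, 0.090, 0.0300, 0.0100],
}
print(best_region(target, regions))   # region_a
```

Because each term is a ratio, invariants of very different magnitudes contribute on a comparable scale, which is one plausible motivation for a relative rather than an absolute difference.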



6 Experimental Results 

In this section some experimental results are shown. To test the proposed method- 
ology the set of faces shown in figure 2 was used. Each image is of 200 x 200 
pixels and has 16 gray levels. 



6.1 Human Face Description 

For each person shown in figure 2, an image called the reference image was taken.
Each reference image was obtained by placing a CCD camera in front of the
subject, trying to keep him/her always within the visual field of the sensor.



Human Face Identification Using Invariant Descriptions 299 






Fig. 2. The set of images used. 



Each face was then segmented as explained in section 2, and the corresponding
describing vectors were obtained, also as explained in section 2. With these
reference vectors, the performance of the genetic algorithm as a face detector
and recognizer was tested. The results are shown next.



6.2 Face Detection and Recognition 

The performance of the GA was tested with a set of 300 images: 19 for each of
the twelve people (228) and 72 containing things other than people. The first 228
images were obtained by again placing a CCD camera in front of the subject at
different distances, trying to keep him/her always within the visual field of the
sensor. Again, he or she was asked to rotate his/her face a little and to change
his/her facial expression to obtain different samples.

The experiment consisted of selecting 50 of the 300 test images at random and
trying to find in them each one of the 12 faces previously described. The search
was individual, one face at a time; 600 tests were thus performed.

During testing, the GA satisfactorily found the desired face in 71% of the cases;
in 21% it failed, i.e. the GA identified a subimage not containing a face as the
face; and in 8% it gave false positives (the GA reported that the face found in
the image was the one looked for when in reality it was not present in the
image). Figure 3 shows some examples of the segmentation results obtained during
the experiments. The results are summarized in table 1. For this table:
Fig. 3. Segmentation results.



— % of false positives is the percentage of cases in which the segmented face 
does not correspond to the desired one. 

— % of lost faces is the percentage of cases in which the desired face was not 
found when it was really present in the image. 

— % of found faces is the percentage of cases in which the desired face was 
satisfactorily found. 
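As a quick consistency check, the overall rates quoted above can be recomputed from the per-face percentages of table 1. The variable names are hypothetical; the numbers are those reported in the table.

```python
# Per-face percentages from table 1, faces 1 to 12.
found = [76, 78, 66, 72, 82, 65, 62, 66, 62, 69, 78, 76]
lost = [18, 15, 25, 18, 10, 26, 31, 24, 31, 24, 13, 18]
false_positives = [6, 7, 9, 10, 8, 9, 7, 10, 7, 7, 9, 6]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Each row of the table should account for 100% of the tests for that face.
rows_ok = all(f + l + p == 100 for f, l, p in zip(found, lost, false_positives))

print(rows_ok)                           # True
print(round(mean(found)))                # 71
print(round(mean(lost)))                 # 21
print(round(mean(false_positives)))      # 8
print(50 * 12)                           # 600 tests in total
```

The rounded averages agree with the 71%, 21% and 8% figures given in the text.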



7 Conclusions and Future Research 

A technique for the automatic identification of a human face in a 2D gray-level
image was presented. The technique uses an invariant description of the
face and a genetic algorithm to accomplish this task. The invariants used are the
first four feature invariants proposed by Hu [1]. In a first step, an image
possibly containing a face is divided into small cells of 5 x 5 pixels. In a
second step, the ordinary moments are computed for each cell. From these, the
corresponding Hu invariants are then derived. Human face segmentation and
identification was thus accomplished by grouping individual cells using a genetic
algorithm that fits a specific cost function. This cost function corresponds to
the invariant description of a human face.

The use of the proposed methodology to segment a face avoids the application
of an exhaustive process over each image point. In the case of an exhaustive
process, for each image point, windows of different sizes have to be evaluated,
and for each one of these windows the specified vector of descriptors has to be
computed and compared with the interest model vector. Instead, the GA sweeps
the entire image by considering only regions of interest, guided by a confidence
function.

Face No.   % found faces   % lost faces   % false positives
    1            76              18                6
    2            78              15                7
    3            66              25                9
    4            72              18               10
    5            82              10                8
    6            65              26                9
    7            62              31                7
    8            66              24               10
    9            62              31                7
   10            69              24                7
   11            78              13                9
   12            76              18                6

Table 1. Experimental results in terms of the search efficiency of the GA.

We are planning to combine our method with other techniques to do cooperative
object recognition. The goal is to put a set of different face detectors to
work on the same image, take their opinions about the image and fuse their
results in a single module that will give the final result. The aim, of course,
is to see whether the combination of several detectors gives better performance.
Abstract. In this paper, a new multiobjective optimization technique
based on the genetic algorithm (GA) is introduced. This method is based
on the concept of min-max optimum, taken from the Operations Research
literature, and can produce the Pareto set and the best trade-off among
the objectives. The results produced by this approach are compared to
those produced with other mathematical programming techniques and
GA-based approaches using a multiobjective optimization tool called
MOSES (Multiobjective Optimization of Systems in the Engineering
Sciences). The importance of representation is hinted at in the example used,
since it can be seen that reducing the chromosomic length of an individual
tends to produce better results in the optimization process, even if
it is at the expense of a higher cardinality alphabet.



1 Introduction 

Engineering optimization has been a very fertile area of research in the last few
years, but the normal trend has been to deal with a single objective at a time, and
to use ideal and unrealistic problems rather than real-world applications.
Assuming only one objective is generally unrealistic for engineering optimization
problems, since most real-world problems have several (possibly conflicting)
objectives. The common practice, therefore, has been to let the designer make
decisions based on his/her experience, instead of using some well-defined
optimality criterion.

Over the years, the Operations Research community has produced more
than 20 mathematical programming techniques to deal with multiple objectives.
However, the main focus of these approaches is to produce a single trade-off based
on some notion of optimality, rather than producing several possible alternatives
from which the designer may choose. More recently, the genetic algorithm (GA),
an artificial intelligence search technique based on the mechanics of natural
selection, has been found to be effective on some scalar optimization problems. In
order to extend the GA to deal with multiple objectives, the structure of the
GA has been modified to handle a vector fitness function.

Part of this work was developed by the author while visiting LANIA (Laboratorio
Nacional de Informática Avanzada) in Xalapa, Veracruz, Mexico.

This paper will review some of the previous work in multiobjective optimization
using GAs, and a new approach, proposed by the author, will be introduced.
Also, MOSES (Multiobjective Optimization of Systems in the Engineering
Sciences), a system developed by the author as a testbed for multiobjective
optimization techniques, will be briefly described together with an example of its
use. The new approach, based on the notion of min-max optimum, is able to
generate the Pareto set and better trade-offs than any of the other techniques
included in MOSES. The importance of using alphabets of cardinality higher
than two will be emphasized, and the results found with this alternative
representation will be shown to be better than those produced using a traditional
binary representation, both for single and multiobjective optimization.

1.1 Statement of the Problem 

Multiobjective optimization (also called multicriteria optimization, multiperfor- 
mance or vector optimization) can be defined as the problem of finding [13]: 

a vector of decision variables which satisfies constraints and optimizes a 
vector function whose elements represent the objective functions. These 
functions form a mathematical description of performance criteria which 
are usually in conflict with each other. Hence, the term “optimize” means 
finding such a solution which would give the values of all the objective 
functions acceptable to the designer. 

Formally, we can state it as follows:

Find the vector $x^* = [x_1^*, x_2^*, \ldots, x_n^*]^T$ which will satisfy the $m$
inequality constraints:

$$g_i(x) \ge 0, \quad i = 1, 2, \ldots, m \qquad (1)$$

the $p$ equality constraints

$$h_i(x) = 0, \quad i = 1, 2, \ldots, p \qquad (2)$$

and optimize the vector function

$$f(x) = [f_1(x), f_2(x), \ldots, f_k(x)]^T \qquad (3)$$

where $x = [x_1, x_2, \ldots, x_n]^T$ is the vector of decision variables.



1.2 Min-Max Optimum 

The idea of stating the min-max optimum and applying it to multiobjective
optimization problems was taken from game theory, which deals with solving
conflicting situations. The min-max approach to a linear model was proposed by
Jutler and Solich and has been further developed by Osyczka [11], Rao [14] and
Tseng & Lu [18].

The min-max optimum compares relative deviations from the separately
attainable minima. Consider the $i$th objective function, for which the relative
deviation can be calculated from

$$z_i'(x) = \frac{|f_i(x) - f_i^0|}{|f_i^0|} \qquad (4)$$

or from

$$z_i''(x) = \frac{|f_i(x) - f_i^0|}{|f_i(x)|} \qquad (5)$$

where $f_i^0$ denotes the separately attainable minimum of the $i$th objective.

It should be clear that for (4) and (5) we have to assume that for every $i \in I$
and for every $x \in X$, $f_i(x) \ne 0$.

If all the objective functions are going to be minimized, then equation (4)
defines function relative increments, whereas if all of them are going to be
maximized, it defines relative decrements. Equation (5) works conversely.
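The use of relative deviations can be sketched in a few lines. Several assumptions are made here and should be read as such: the two deviations are taken in the common forms $|f_i(x) - f_i^0| / |f_i^0|$ and $|f_i(x) - f_i^0| / |f_i(x)|$, the larger of the two is kept for each objective, and the preferred candidate is the one whose worst deviation is smallest. The test problem and all names are hypothetical.

```python
def relative_deviations(f_value, f_min):
    """The two relative deviations of one objective from its own minimum."""
    spread = abs(f_value - f_min)
    return spread / abs(f_min), spread / abs(f_value)


def worst_deviation(x, objectives, minima):
    """Largest relative deviation of candidate x over all objectives."""
    return max(
        max(relative_deviations(f(x), f_min))
        for f, f_min in zip(objectives, minima)
    )


def min_max_choice(candidates, objectives, minima):
    """Candidate whose worst relative deviation is the smallest."""
    return min(candidates, key=lambda x: worst_deviation(x, objectives, minima))


# Two conflicting objectives to be minimized, each with minimum value 1.
objectives = [lambda x: x ** 2 + 1, lambda x: (x - 2) ** 2 + 1]
minima = [1.0, 1.0]
candidates = [0.0, 0.5, 1.0, 1.5, 2.0]

best = min_max_choice(candidates, objectives, minima)
print(best)   # 1.0, the compromise halfway between the two individual minima
```

Candidates at either extreme are optimal for one objective but deviate by 400% on the other, so the rule favors the balanced point.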

2 Multiobjective Optimization Using GAs 

Some of the most important GA-based multiobjective optimization techniques 
will be briefly explained in this section. 



3 VEGA 

David Schaffer [16] extended Grefenstette's GENESIS program [8] to include 
multiple objective functions. Schaffer's approach was to use an extension of the 
Simple Genetic Algorithm (SGA) that he called the Vector Evaluated Genetic 
Algorithm (VEGA), and that differed from the SGA only in the way in which 
selection was performed. This operator was modified so that at each generation 
a number of sub-populations was generated by performing proportional 
selection according to each objective function in turn. Thus, for a problem with k 
objectives, k sub-populations of size N/k each would be generated, assuming a 
total population size of N. These sub-populations would be shuffled together to 
obtain a new population of size N, on which the GA would apply the crossover 
and mutation operators in the usual way. Schaffer realized that the solutions 
generated by his system were non-inferior in a local sense, because their 
non-inferiority is limited to the current population, and while a locally dominated 
individual is also globally dominated, the converse is not necessarily true [16]. 

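
The selection step of VEGA can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every objective is taken to return a positive fitness to be maximized, the population size is assumed to be divisible by the number of objectives, and `vega_selection` is a hypothetical name rather than Schaffer's code.

```python
import random


def vega_selection(population, objectives, rng=random):
    """Build a mating pool from k proportionally selected sub-populations."""
    k = len(objectives)
    sub_size = len(population) // k
    pool = []
    for objective in objectives:
        # Proportional (roulette-wheel) selection on one objective at a time.
        weights = [objective(ind) for ind in population]
        pool.extend(rng.choices(population, weights=weights, k=sub_size))
    rng.shuffle(pool)  # mix the sub-populations before crossover and mutation
    return pool
```

Crossover and mutation would then be applied to the returned pool exactly as in a single-objective GA.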
4 Lexicographic Ordering 

The basic idea of this technique is that the designer ranks the objectives in order 
of importance. The optimum solution is then found by minimizing the objective 
functions, starting with the most important one and proceeding according to 
the order of importance of the objectives [15]. Fourman [6] suggested a selec- 
tion scheme based on lexicographic ordering. In a first version of his algorithm, 
objectives were assigned different priorities by the user and each pair of 
individuals was compared according to the objective with the highest priority. If this 
resulted in a tie, the objective with the second highest priority was used, and 
so on. A second version of this algorithm, reported to work surprisingly well, 
consisted of randomly selecting the objective to be used in each comparison. 
As in VEGA, this corresponds to averaging fitness across fitness components, 
each component being weighted by the probability of each objective being cho- 
sen to decide each tournament [5]. However, the use of pairwise comparisons 
makes an important difference with respect to VEGA, since in this case scale 
information is ignored. Therefore, the population may be able to see as convex 
a concave trade-off surface, depending on its current distribution, and on the 
problem itself. 
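
Both comparison schemes can be sketched as follows. The snippet assumes minimization and objectives given as callables listed in priority order; the function names are illustrative, not Fourman's.

```python
import random


def lexicographic_winner(a, b, objectives):
    """Compare by objectives in priority order; later ones only break ties."""
    for objective in objectives:
        fa, fb = objective(a), objective(b)
        if fa != fb:
            return a if fa < fb else b
    return a  # complete tie: keep the first individual


def random_objective_winner(a, b, objectives, rng=random):
    """Second variant: a randomly drawn objective decides the comparison."""
    objective = rng.choice(objectives)
    return a if objective(a) <= objective(b) else b
```

Because only the outcome of each pairwise comparison is used, the magnitude of the difference between two objective values never enters the selection, which is the loss of scale information mentioned above.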

5 Weighted Sum: Hajela’s Method 

Hajela and Lin [9] included the weights of each objective in the chromosome, and 
promoted their diversity in the population through fitness sharing. Their goal 
was to be able to simultaneously generate a family of Pareto optimal designs 
corresponding to different weighting coefficients in a single run of the GA. Besides 
using sharing, Hajela and Lin used a vector evaluated approach based on VEGA 
to achieve their goal. They proposed the use of a utility function of the form: 

$$U = \sum_{i=1}^{l} W_i \frac{F_i}{F_i^*} \qquad (6)$$

where $F_i^*$ are the scaling parameters for the objective criterion, l is the 
number of objective functions, and $W_i$ are the weighting factors for each objective 
function $F_i$. In MOSES's implementation, a min-max approach was used to 
determine the utility function, so that the scaling factor was the ideal vector. 
Hajela's approach also uses a sharing function of the form: 

$$\phi(d_{ij}) = \begin{cases} 1 - \left(\dfrac{d_{ij}}{\sigma_{sh}}\right)^{\alpha}, & d_{ij} < \sigma_{sh} \\ 0, & \text{otherwise} \end{cases} \qquad (7)$$

where $\alpha = 1$ for this work, $d_{ij}$ is a metric indicative of the distance between 
designs i and j, and $\sigma_{sh}$ is the sharing parameter, which is typically chosen 
between 0.01 and 0.1. The fitness of a design i is then modified as: 

$$f_{s_i} = \frac{f_i}{\sum_{j=1}^{M} \phi(d_{ij})} \qquad (8)$$

where M is the number of designs located in the vicinity of the ith design. 
Hajela incorporates weight combinations into the chromosome string, and 
under his representation, a single number represents not the weight itself, but 
a combination of them. For example, the number 4 (under floating point 
representation) could represent the weight vector (0.4, 0.6) for a problem with two 
objective functions. Then, sharing is done on the weights. Finally, a mating 
restriction mechanism was imposed, to prevent members within a radius $\sigma_{mat}$ from 
crossing. 
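
A minimal sketch of fitness sharing is given below. It assumes the common power-law sharing function, 1 - (d / sigma_share)^alpha inside the niche radius and 0 outside, and a user-supplied distance; all names are illustrative.

```python
def sharing(d, sigma_share, alpha=1.0):
    """Sharing function: 1 at distance 0, falling to 0 at the niche radius."""
    if d < sigma_share:
        return 1.0 - (d / sigma_share) ** alpha
    return 0.0


def shared_fitness(fitness, points, distance, sigma_share, alpha=1.0):
    """Divide each raw fitness by its niche count (sum of sharing values)."""
    shared = []
    for i, p in enumerate(points):
        # The point itself contributes sharing(0) = 1, so the count is >= 1.
        niche_count = sum(sharing(distance(p, q), sigma_share, alpha) for q in points)
        shared.append(fitness[i] / niche_count)
    return shared
```

With points 0.0 and 0.05 lying in the same niche (radius 0.1) and a third point far away, the two crowded points have their fitness reduced while the isolated point keeps its raw value.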



6 MOGA 

Fonseca and Fleming [4] have proposed a scheme in which the rank of a certain 
individual corresponds to the number of chromosomes in the current population 
by which it is dominated. Consider, for example, an individual $x_i$ at generation t, 
which is dominated by $p_i^{(t)}$ individuals in the current generation. Its current 
position in the individuals' rank can be given by [4]: 

$$\mathrm{rank}(x_i, t) = 1 + p_i^{(t)} \qquad (9)$$

All non-dominated individuals are assigned rank 1, while dominated ones are 
penalized according to the population density of the corresponding region of the 
trade-off surface. See Fonseca and Fleming [4] for details. 
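
The ranking rule (one plus the number of individuals that dominate a given one) can be sketched as follows, assuming minimization of every objective. The helper names are illustrative.

```python
def dominates(u, v):
    """True if u is no worse than v in every objective and better in one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def moga_ranks(objective_vectors):
    """Rank of each individual: 1 + number of individuals dominating it."""
    return [1 + sum(dominates(other, own) for other in objective_vectors)
            for own in objective_vectors]


ranks = moga_ranks([(1, 4), (2, 2), (3, 3), (4, 4)])
```

In this small example the first two vectors are non-dominated and receive rank 1, while the last vector is dominated by all three others.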



7 NSGA 

The Non-dominated Sorting Genetic Algorithm (NSGA) was proposed by Srini- 
vas and Deb [17], and is based on several layers of classifications of the individ- 
uals. Before the selection is performed, the population is ranked on the basis 
of nondomination: all nondominated individuals are classified into one category 
(with a dummy fitness value, which is proportional to the population size, to 
provide an equal reproductive potential for these individuals). To maintain the 
diversity of the population, these classified individuals are shared with their 
dummy fitness values. Then this group of classified individuals is ignored and 
another layer of nondominated individuals is considered. The process continues 
until all individuals in the population are classified. A stochastic remainder 
proportionate selection was used for this approach. Since individuals in the first 
front have the maximum fitness value, they always get more copies than the rest 
of the population. This allows the search to concentrate on nondominated regions, 
and results in quick convergence of the population toward such regions. Sharing, 
for its part, helps to distribute the population over this region. 
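
The layered classification can be sketched as below. The snippet only peels off successive nondominated fronts, assuming minimization; the dummy fitness assignment and the sharing step are deliberately left out, and the names are illustrative.

```python
def dominates(u, v):
    """True if u is no worse than v in every objective and better in one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def nondominated_fronts(objective_vectors):
    """Return lists of indices: front 1, then front 2 among the rest, etc."""
    remaining = set(range(len(objective_vectors)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objective_vectors[j], objective_vectors[i])
                            for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)  # ignore the classified layer from now on
    return fronts


fronts = nondominated_fronts([(1, 4), (2, 2), (3, 3), (4, 4)])
```

Each front would then receive a dummy fitness lower than that of the previous one before selection is applied.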

8 NPGA 

Horn and Nafpliotis [10] proposed a tournament selection scheme based on 
Pareto dominance. Instead of limiting the comparison to two individuals, a num- 
ber of other individuals in the population was used to help determine dominance. 
When both competitors were either dominated or non-dominated (i.e., there was 
a tie), the result of the tournament was decided through fitness sharing [7]. The 
pseudocode for Pareto domination tournaments assuming that all of the objec- 
tives are to be maximized can be found in Horn and Nafpliotis [10]. 
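
A rough sketch of such a tournament is shown below. Unlike the pseudocode referred to above, it assumes minimization, and it takes the tie-breaking rule (fitness sharing in the original scheme) as a caller-supplied function; all names are hypothetical.

```python
def dominates(u, v):
    """True if u is no worse than v in every objective and better in one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def pareto_tournament(c1, c2, comparison_set, evaluate, tie_break):
    """Pick the candidate not dominated by the comparison set, if only one is."""
    c1_dominated = any(dominates(evaluate(s), evaluate(c1)) for s in comparison_set)
    c2_dominated = any(dominates(evaluate(s), evaluate(c2)) for s in comparison_set)
    if c1_dominated != c2_dominated:
        return c2 if c1_dominated else c1
    return tie_break(c1, c2)  # both dominated or both non-dominated
```

The size of the comparison set controls the selection pressure: a larger set makes it more likely that a mediocre candidate is found to be dominated.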

9 An Approach Using a Min-Max Strategy 

The idea of this approach is to generate the individuals in such a way that they 
all constitute feasible solutions. This can be ensured by checking that none of 
the constraints is violated by the solution vector encoded by the corresponding 
chromosome, and by designing special operators. Then the user has to provide 
a vector of weights, which are used to spawn as many processes as weight com- 
binations are provided (normally this number will be reasonably small). Each 
process is really a separate GA in which the given weight combination is used in 
conjunction with a min-max approach to generate a single solution. After the n 
processes are terminated (n=number of weight combinations provided), a final 
file is generated containing the Pareto set, which is formed by picking up the 
best solution from each of the processes spawned in the previous step. Since this 
approach requires knowing the ideal vector, the user is given the opportunity to 
provide such values directly (in case he/she knows them) or to use another GA 
to generate it. 

10 Example 

To illustrate the use of MOSES and the efficiency of the new technique proposed, 
one engineering design example was selected from the literature [3]. Since it is 
generally intractable to obtain an analytical representation of the Pareto front, 
it is usually very difficult to measure the performance of a multiobjective 
optimization technique. For the purposes of this paper the results were compared 
only in terms of the best trade-offs that could be achieved. To that end, the 
following expression was used 

$$L_p(f) = \sum_{i=1}^{k} w_i \frac{|f_i^0 - f_i(x)|}{\rho_i} \qquad (10)$$

where k is the number of objectives, $\rho_i = f_i^0$ or $f_i(x)$, depending on which 
gives the maximum value for $L_p(f)$, and $w_i$ refers to the weight assigned to 
each objective (if not known, equal weights are assigned to all the objectives). A 
sketch of the Pareto front produced by each technique can actually be obtained 
with MOSES, but due to space limitations such graphs are not included in this 
paper. 
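
A possible reading of this comparison measure is sketched below: a weighted sum of relative deviations from the ideal vector, where each deviation is divided by whichever of the ideal value or the attained value makes it larger. The function name and the sample numbers are illustrative, and the absolute values are an assumption.

```python
def lp_measure(f, f_ideal, weights=None):
    """Weighted sum of the larger relative deviation of each objective."""
    k = len(f)
    if weights is None:
        weights = [1.0 / k] * k  # equal weights when none are given
    total = 0.0
    for w, fi, f0 in zip(weights, f, f_ideal):
        diff = abs(f0 - fi)
        # Dividing by the smaller magnitude gives the larger deviation.
        total += w * max(diff / abs(f0), diff / abs(fi))
    return total


value = lp_measure((12.0, 0.30), (10.0, 0.20))
```

A solution that coincides with the ideal vector scores zero, and smaller values indicate a better overall trade-off.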
Fig. 1. Sketch of the machine tool spindle used for the example. 



10.1 Design of a Machine Tool Spindle 

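Consider the problem of a preliminary design of a machine tool spindle as 
presented in Figure 1 (taken from Eschenauer et al. [3]). The formulation of the 
multiobjective optimization problem is to minimize $f_1(x)$ and $f_2(x)$ as defined 
below [3]. 

$$f_1(x) = \frac{\pi}{4}\left[a(d_a^2 - d_o^2) + l(d_b^2 - d_o^2)\right] \qquad (11)$$

$$f_2(x) = \frac{F a^3}{3 E I_a}\left(1 + \frac{l}{a}\frac{I_a}{I_b}\right) + \frac{F}{c_a}\left[\left(1 + \frac{a}{l}\right)^2 + \frac{c_a a^2}{c_b l^2}\right] \qquad (12)$$

where 

$$I_a = 0.049\,(d_a^4 - d_o^4), \qquad I_b = 0.049\,(d_b^4 - d_o^4)$$

$$c_a = 35400\,|\delta_{ra}|^{1/9}\, d_a^{10/9}, \qquad c_b = 35400\,|\delta_{rb}|^{1/9}\, d_b^{10/9}$$

subject to the constraints 

$$g_1(x) = l - l_g \le 0, \qquad g_2(x) = l_k - l \le 0$$

$$g_3(x) = d_{a1} - d_a \le 0, \qquad g_4(x) = d_a - d_{a2} \le 0$$

$$g_5(x) = d_{b1} - d_b \le 0, \qquad g_6(x) = d_b - d_{b2} \le 0$$

$$g_7(x) = d_{om} - d_o \le 0, \qquad g_8(x) = p_1 d_o - d_b \le 0$$

$$g_9(x) = p_2 d_b - d_a \le 0$$

$$g_{10}(x) = \left|\Delta_a + (\Delta_a - \Delta_b)\frac{a}{l}\right| - \Delta \le 0 \qquad (18)$$

For this example, it is assumed that $d_a$ must be chosen from the set $X_3 = 
\{80, 85, 90, 95\}$, and $d_b$ from the set $X_4 = \{75, 80, 85, 90\}$. Additionally, the 
following constant parameters are assumed: 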

$d_{om}$ = 25.00 mm               $d_{a1}$ = 80.00 mm 
$d_{a2}$ = 95.00 mm               $d_{b1}$ = 75.00 mm 
$d_{b2}$ = 90.00 mm               $p_1$ = 1.25 
$p_2$ = 1.05                      $l_k$ = 150.00 mm 
$l_g$ = 200.00 mm                 a = 80.00 mm 
E = 210,000.0 N/mm^2              F = 10,000 N 
$\Delta_a$ = 0.00540000 mm        $\Delta_b$ = -0.00540000 mm 
$\Delta$ = 0.01000000 mm          $\delta_{ra}$ = -0.00100000 mm 
$\delta_{rb}$ = -0.00100000 mm 
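
The example can be checked numerically with a short Python sketch. It is written under assumptions: the decision vector is taken as (d_o, l, d_a, d_b), and the expressions for f1, f2 and the ten constraints are a reconstruction of the formulation from [3] as coded below, not a verified transcription. The names are illustrative.

```python
import math

# Constant parameters of the example (lengths in mm, forces in N).
A, E, F = 80.0, 210_000.0, 10_000.0
D_OM, D_A1, D_A2, D_B1, D_B2 = 25.0, 80.0, 95.0, 75.0, 90.0
P1, P2, L_K, L_G = 1.25, 1.05, 150.0, 200.0
DELTA_A, DELTA_B, DELTA = 0.0054, -0.0054, 0.01
DELTA_RA, DELTA_RB = -0.001, -0.001


def objectives(d_o, l, d_a, d_b):
    """Return (f1, f2): spindle volume and static displacement."""
    f1 = math.pi / 4 * (A * (d_a**2 - d_o**2) + l * (d_b**2 - d_o**2))
    i_a = 0.049 * (d_a**4 - d_o**4)
    i_b = 0.049 * (d_b**4 - d_o**4)
    c_a = 35400 * abs(DELTA_RA) ** (1 / 9) * d_a ** (10 / 9)
    c_b = 35400 * abs(DELTA_RB) ** (1 / 9) * d_b ** (10 / 9)
    f2 = (F * A**3 / (3 * E * i_a) * (1 + l / A * i_a / i_b)
          + F / c_a * ((1 + A / l) ** 2 + c_a * A**2 / (c_b * l**2)))
    return f1, f2


def constraints(d_o, l, d_a, d_b):
    """Values of g1..g10; a design is feasible when all are <= 0."""
    return [l - L_G, L_K - l, D_A1 - d_a, d_a - D_A2, D_B1 - d_b, d_b - D_B2,
            D_OM - d_o, P1 * d_o - d_b, P2 * d_b - d_a,
            abs(DELTA_A + (DELTA_A - DELTA_B) * A / l) - DELTA]


f1, f2 = objectives(26.26, 193.29, 90, 85)
feasible = all(g <= 0 for g in constraints(26.26, 193.29, 90, 85))
```

Evaluating the design (26.26, 193.29, 90, 85) with this sketch gives values close to the 1457744.67 and 0.019242 reported for that design in the comparison of results, which supports the reconstruction of the two objective functions.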



11 Comparison of Results 

The ideal vector that each method generates will be compared with the best 
results reported in the literature [3]. The two Monte Carlo methods included 
in MOSES were used, together with Osyczka’s multiobjective optimization sys- 
tem [12] to obtain the ideal vector. Also, several GA-based approaches will be 
tested using the same parameters (same population size and same crossover and 
mutation rates). If niching is required, then the niche size will be computed 
according to the methodology suggested by the developers of the method. 



Method          x1      x2       x3   x4   f1           f2 
Monte Carlo 1   59.08   189.17   90   75   606667.43    0.032467 
Monte Carlo 1   26.26   193.29   90   85   1457744.67   0.019242 
GA (Binary)     60.00   200.00   80   75   466532.80    0.038087 
GA (Binary)     25.00   190.09   95   90   1640191.80   0.016613 
GA (FP)         56.16   194.49   95   90   312430.43    0.017951 
GA (FP)         25.35   189.58   95   90   1641135.80   0.016615 
Literature      63.89   183.29   85   80   531059.80    0.030182 
Literature      66.45   183.36   95   85   694101.00    0.023078 

Table 1. Comparison of results computing the ideal vector of the example 
(design of a machine tool spindle). For each method the best results for optimum f1 
and f2 are shown in boldface. 



The ideal vector of this problem was computed using the two Monte Carlo 
Methods included in MOSES (generating 100 points), and a GA (with a pop- 
ulation of 100 chromosomes running during 50 generations) using binary and 
floating point representation. The corresponding results are shown in Table 1, 
including the best results reported in the literature [3]. The results for Monte 
Carlo Method 2 are the same as for Method 1. Notice that since Osyczka's 
multiobjective optimization system is not able to handle discrete variables, no 
results are available for the min-max method using Osyczka's system. The GA 
using both binary and floating point representation found the ideal vector with 
a procedure to adjust its parameters that has been described elsewhere [1] 
(the results shown are the best after 81 runs). As can be seen in the results, the 
best result for the first objective was found using floating point representation, 
and the best result for the second objective was found using binary represen- 
tation, although the difference for this second objective is not really significant 
with respect to the difference for the first objective. The use of this floating 
point representation in various single and multiobjective optimization problems 
has been found to be superior (in general) to the binary representation, mainly 
as we increase the number of variables or their respective allowable ranges [1]. In 
terms of the multiobjective optimization problem, the new technique introduced 
in this paper produces a better overall result than any of the other existing ap- 
proaches, including the mathematical programming techniques. As it turns out, 
this technique also produces the best sketch of the Pareto front and is able to 
keep it for as many generations as necessary, in contrast with the other GA-based 
techniques, which either lose the front very quickly (e.g., VEGA, NSGA, 
and Hajela's method) or are not able to find it at all (e.g., GALC and the 
Lexicographic method). The only two approaches that can really compete with the 
new technique in terms of finding the Pareto front are NPGA and MOGA, not 
only in this but in most of the other problems analyzed by the author [2]. 

12 Conclusions 

A new multiobjective optimization method based on the min-max optimization 
approach has been proposed. This approach is very robust because it transforms 
the multiobjective optimization problem into several single objective optimiza- 
tion problems, and it works very well independently of the representation scheme 
used. However, a floating point representation seems to work better for numeri- 
cal optimization applications. The main drawback of the new approach is that it 
requires the ideal vector and a set of weights to delineate the Pareto set. Never- 
theless, when the ideal vector is not known, a set of target (desirable) values for 
each objective can be provided instead. Also, finding proper weights is normally 
an easy task, since not many of them are required to get reasonably good results. 

13 Future Work 

Much additional work remains to be done, since this is a very broad area of 
research. For example, it is desirable to do more theoretical work on niches 
and population sizes for multiobjective optimization problems to verify some of 
the empirical results obtained by the author. In that sense, it is expected that 
MOSES may be useful as an experimentation tool for those interested in this 
area. To talk about convergence in this context seems a rather difficult task, since 
there is no common agreement on what optimum really means. However, if we use 
concepts from Operations Research such as the min-max optimum, it should be 
possible to develop such a theory of convergence for these kinds of problems. Also, 
it is highly desirable to be able to find more ways of incorporating knowledge 
about the domain into the GA, as long as it can be automatically assimilated 
by the algorithm during its execution and does not have to be provided by the 
user (to preserve its generality). 

Method               x1      x2       x3   x4   f1           f2        Lp(f) 
Ideal Vector         -       -        -    -    312430.43    0.01662   0.00000 
Monte Carlo 1        56.67   190.22   85   80   728581.78    0.02647   1.92555 
Monte Carlo 2        26.26   193.29   90   85   1457744.67   0.01924   3.82407 
GALC (B)             42.27   187.83   95   90   1386131.13   0.01696   3.45719 
GALC (FP)            42.78   188.01   95   90   1377893.38   0.01697   3.43203 
Lexicographic (B)    62.02   200.00   95   85   856072.60    0.02184   2.05486 
Lexicographic (FP)   61.98   190.91   95   80   709307.00    0.02619   1.84682 
VEGA (B)             54.63   200.00   90   85   987526.38    0.02124   2.43936 
VEGA (FP)            54.45   191.11   95   90   1151553.50   0.01775   2.75405 
NSGA (B)             65.22   200.00   90   85   708412.19    0.02439   1.73510 
NSGA (FP)            62.00   197.36   95   90   985238.13    0.01884   2.28746 
MOGA (B)             65.52   200.00   90   85   699786.88    0.02453   1.71643 
MOGA (FP)            67.75   189.34   95   90   800608.63    0.02011   1.77289 
NPGA (B)             57.92   200.00   90   75   654768.06    0.03223   2.03595 
NPGA (FP)            43.53   187.86   95   90   1363536.50   0.01701   3.38794 
Hajela (B)           59.87   188.12   95   80   757841.81    0.02498   1.92946 
Hajela (FP)          61.19   188.10   95   90   975296.19    0.01861   2.24167 
GAminmax (B)         66.99   200.00   90   85   656950.38    0.02532   1.62676 
GAminmax (FP)        71.98   188.17   95   90   672894.56    0.02169   1.45917 

Table 2. Comparison of the best overall solution found by each one of the 
methods included in MOSES for the example given. GA-based methods were 
tried with binary (B) and floating point (FP) representations. The following 
abbreviation was used: GALC = Genetic Algorithm with a linear combination 
of objectives using scaling. In all cases, weights were assumed equal to 0.5 (equal 
weight for every objective). 


Abstract. This paper describes a new approach to the generation of 
good solutions to the TSP using evolution programs. A novel genetic 
crossover operator is introduced, which generates a single offspring 
without explicitly preserving particular characteristics like position, 
order or adjacency. A geometrical explanation of the heuristic rests on 
the fact that a good TSP tour does not have crossing edges (or knots). 
The crossover technique aims to untie the knots and, consequently, the 
evolutionary search is executed over populations of less knotted 
solutions. This operator, which we have called knot-cracker, always 
produces legal offspring, avoiding repairing techniques. Experiments 
were performed over the same benchmarks used by the authors of the 
Enhanced Edge Recombination operator (EER) in their seminal 
publication. An enhanced version of the knot-cracker is also proposed, 
which improves previous results, reaches high quality solutions with 
high consistency and shows a better performance than the EER. 



1 Introduction 

The TSP is a problem in combinatorial optimization that involves finding the shortest 
Hamiltonian cycle in a complete graph of n nodes. It can be simply stated as follows: 
a traveling salesman must plan his itinerary to visit each of n cities exactly once and 
return to the starting point, minimizing the total cost of the tour. In its Euclidean 
version the total cost of a solution is the length of the route measured with the 
Euclidean distance. The TSP is a classical example of an NP-hard problem. For 
problems of this class, it is theoretically possible to enumerate all permutations, 
evaluating each with respect to the stated objective and retaining the optimal. For all 
but small n this exhaustive search procedure will fail because of time requirements: 
the number of permutations grows factorially with n. As alternative resolution 
methods many heuristics, inspired by different fields, have been proposed. Some of 
them rely on analogies to natural processes, such as neural networks, ant colonies, 
simulated annealing and genetic algorithms. This paper focuses on the genetic 
algorithm (GA) approach, an application that has been of modest success [5]. 

The GA community interested in the TSP has emphasized the importance of 
designing efficient genetic operators, particularly crossover operators, strongly 
dependent on the genetic representation of the solutions. The canonical binary 
codification of chromosomes is not well suited to sequencing problems [3], 
thus special representations have been proposed. One of the most natural and widely 
used is the path representation, where an n-tuple of integers represents the visiting 
sequence of the cities; the cycle is closed by linking the last city visited to the first. 
For this representation several crossover operators have been developed; they 
generally emphasize the inheritance of at least one of three typical characteristics of 
this kind of problem: position, order and adjacency. T. Starkweather et al. [4] made a 
performance comparison between six genetic sequencing operators on an instance of 
the TSP and on a scheduling problem. Their results indicate that the effectiveness of 
different operators is dependent on the problem domain. For the TSP they found that 
the EER proposed by D. Whitley et al. [5], which explicitly preserves most of the 
adjacency information of one of the parents, had the best performance. They conclude 
that the key difference between the operators' performance is the kind of information 
that each attempts to preserve during recombination and, for the TSP, the important 
information would seem to be the adjacency information [5]. In this paper we 
introduce a new crossover operator for the TSP that was designed with no such 
preservation goal in mind. 

The organization of this paper is as follows. In section 2 we explain the basics 
behind the proposed operator. In section 3 the workings of the operator are described. 
Section 4 explains the implementation and experimentation process and shows the 
results obtained with a first version of the knot-cracker over Oliver's 30 cities 
problem [5]. Section 5 describes an enhancement of the operator and the results 
obtained over the complete benchmark [5]. Finally, there is a concluding section. 



2 The Basic Ideas Behind the Operator 

There are two basic ideas behind the proposed operator: one of them can be 
geometrically explained and the other is based on a physical analogy established by 
F. Marin [2]. Figure 1 illustrates the former: a global improvement in the cost of a 
TSP tour can be attained by undoing the crossing of two edges. The result is shown in 
the figure with dashed lines. 

The cross-product of the position vectors of two consecutive cities can be used, in an 
indirect way, to establish the existence of crossing edges (or knots, for short) in a 
two-dimensional TSP. As shown in figure 2, the transition from one city a to another 
city b can be seen as a rotation of the position vectors of the corresponding cities 
around the reference point k. The sense of this rotation is indicated by the sign of the 
cross-product of the position vectors. Thus, a sign change of this cross-product when 
considering successive transitions can be a hint of knot existence. The cross-product 
of the position vectors in the transition from a to b (see figure 2) has a positive sign, 
indicating a counterclockwise rotation around the reference point k. Whereas the 
cross-product of the position vectors in the transition from b to c has a negative sign 
indicating now a clockwise rotation. This successive change of sign of the cross- 
product is indicative of the presence of knots. It is important to note that not every 
change of sign of the cross-product necessarily means that there is a knot present. 
This can be seen in figure 3 . 
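
The sign test can be sketched in Python as follows. Cities are taken to be (x, y) tuples and k is the reference point; the function names and the sample tours are illustrative.

```python
def cross(k, a, b):
    """z-component of (a - k) x (b - k); positive means counterclockwise."""
    return (a[0] - k[0]) * (b[1] - k[1]) - (a[1] - k[1]) * (b[0] - k[0])


def turning_signs(tour, k):
    """Sign (+1, 0 or -1) of each transition of the closed tour about k."""
    n = len(tour)
    values = [cross(k, tour[i], tour[(i + 1) % n]) for i in range(n)]
    return [(v > 0) - (v < 0) for v in values]


square = turning_signs([(1, 0), (0, 1), (-1, 0), (0, -1)], (0, 0))
bow_tie = turning_signs([(0, 0), (2, 2), (2, 0), (0, 2)], (1, 0.5))
```

For the square visited counterclockwise every transition has the same sign, whereas the bow-tie tour, whose two diagonals cross, shows a change of sign along the sequence.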




Fig. 1. The dashed lines show the effect of undoing the crossing of two edges on the total 
length of a TSP tour. 




Fig. 2. A successive change of sign of the cross-product can be indicative of the presence of 
knots. 




Fig. 3. Not every change of sign of successive cross-product means there is a knot present. 

On the other hand, there is a physical motivation for the proposed operator. In 1996 
F. Marin [2] introduced an order parameter for the TSP by making an analogy with a 
classical particle traveling around a closed trajectory. The order parameter proposed 
by Marin is proportional to the sum of the cross-products of the position vectors of the 
cities and is not dependent on the origin of coordinates. He solved, applying the 
simulated annealing technique, several instances of the TSP in a special configuration 
(cities equally spaced on a circumference). It was observed that the values of this 
parameter fluctuate around zero at high temperatures when the system is highly 
disordered (large number of knots). On the other hand, at low temperatures, when the 
system is less disordered, the sign of the parameter remains unchanged and its value 
saturates in a manner which is instance dependent. The author established a transition 
temperature where the order parameter drastically changes its behavior suggesting 
that the starting point for the annealing schedule should be just above that transition 
temperature. 
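
The independence from the origin can be illustrated with a few lines of Python. The sketch assumes a proportionality constant of 1, so the quantity computed is simply the sum of the cross-products of consecutive position vectors along the closed tour (twice its signed area); the name `order_parameter` is illustrative.

```python
def order_parameter(tour):
    """Sum of cross-products of consecutive position vectors of a closed tour."""
    n = len(tour)
    total = 0.0
    for i in range(n):
        (x1, y1), (x2, y2) = tour[i], tour[(i + 1) % n]
        total += x1 * y2 - y1 * x2
    return total


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
shifted = [(x + 5, y - 3) for x, y in square]
bow_tie = [(0, 0), (1, 1), (1, 0), (0, 1)]
```

The square and its translated copy give the same value, while the bow-tie ordering of the same four cities gives zero because the contributions of its two loops cancel.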



3 How the Knot-Cracker Works 

The knot-cracker, as we named the proposed operator, is essentially of crossover type. 
It produces an offspring by selecting subtours from one parent provided it preserves 
the current turning sense (i.e. the sign of the cross-product). We call the parent from 
which the last city has been taken the current parent and the other parent is the 
alternate one. The length of the subtour to be inherited from any parent depends on its 
capacity for retaining the turning sense. At the moment that the former subtour 
changes its turning sense, the following subtour can be inherited from the alternate 
parent. The shift from one parent to the other can be stated as follows. If the city 
proposed by the current parent produces a sign change in the cross-product and the 
next city in the alternate parent does not, the latter is included in the offspring. In that 
case the alternate and the current parents shift. If none of the parents offers a solution 
to a suspicious link, the city from the current parent is included in the offspring. Then a 
possible knotted sequence is inherited and the current and alternate parents remain 
unshifted. As usual, every city included in the offspring must be deleted from both 
parents. 

In the experiments two opposed effects are observed, allowing in the first case just 
one parent shift in a recombination (as in the one-point crossover) or in the second 
case, all the possible parent shifts that can be made (a multiple-point crossover). In 
the latter case we observed an accelerated convergence at the early stages of the run 
and a premature loss of diversity. The opposite behavior is observed in the former case: 
a slower initial convergence rate, more diversity and better values at the end. As a 
compromise solution we adopted, as a rule of thumb, to randomly allow all possible 
parent shifts with a probability that linearly decreases along the run. Otherwise just 
one shift is allowed. 

At the beginning of the crossover operation, the current parent, the starting city and 
the turning sense of the tour are selected at random. The reference point for the 
position vectors of the cities with which the cross-product is to be evaluated is also 
randomly selected. This point is chosen in the plane containing all the cities defined 
by the extremal ones, and is changed for each application of the operator. The first 
edge to be placed in the offspring is defined by the starting city and its successive 
neighbor in the selected turning sense. In the following example the way the operator 
works is illustrated. Let's take two parents: 

P1: a b c d e f g h 
P2: a b c g h e f d 

and let's suppose the initial current parent to be P1, the starting city is b and the 
cross-product is negative. The offspring begins with: 

O: b c 

To point out where each city comes from and its deletion, we shall underline each city 
in the parent from which it is inherited (written here as [x]) and strike it through in 
the alternate parent (written here as -x-), as follows: 

P1: a [b] [c] d e f g h 
P2: a -b- -c- g h e f d 

The next city suggested by P1 is d and the following in P2 is g. Suppose that the 
cross-product involving c (the last city placed in the offspring) and d keeps the 
turning sense; the offspring grows as follows: 

O: b c d 

and 

P1: a [b] [c] [d] e f g h 
P2: a -b- -c- g h e f -d- 

According to P1 the next city is e and according to P2 it is a. Suppose that the 
cross-product involving d and e is positive and the one involving d and a is negative; 
then P2 becomes the current parent and P1 the alternate one. The individuals are: 

O: b c d a 
P1: -a- [b] [c] [d] e f g h 
P2: [a] -b- -c- g h e f -d- 

The next city in P2 is g and e is the next from P1. If the turning sense is kept 
when selecting g, the three individuals are now 

O: b c d a g 
P1: -a- [b] [c] [d] e f -g- h 
P2: [a] -b- -c- [g] h e f -d- 

From the current parent P2 the next city is h, the same as from P1. If the turning sense 
when going from g to h is not kept, P2 remains the current parent and the offspring 
is possibly knotted: 

O: b c d a g h 
P1: -a- [b] [c] [d] e f -g- -h- 
P2: [a] -b- -c- [g] [h] e f -d- 

Finally, let's suppose that the original sense of turning remains unchanged by 
including the last two cities from P2; the three individuals are then: 

O: b c d a g h e f 
P1: -a- [b] [c] [d] -e- -f- -g- -h- 
P2: [a] -b- -c- [g] [h] [e] [f] -d- 

It can be observed that the subtour bcd is inherited from P1 and the subtour ghef 
comes from P2. The knot-cracker induces, as most sequencing crossover operators do, 
an intrinsic mutation from the point of view of adjacency. In this example the new 
links ag and fb, which were not present in any parent, have been introduced in the 
offspring. 

Last but not least, from the previous explanation of the offspring building process 
it follows that the knot-cracker always produces legal offspring, and does not require 
time consuming repairing techniques. 
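
The building process can be sketched in Python as shown below. This is a simplified reading, not the authors' implementation: the first edge is always the starting city followed by its successor in the first parent, the turning sense of that edge is kept as the reference for the whole recombination, a zero cross-product is counted as negative, and `max_shifts` is a hypothetical parameter that limits the number of parent shifts (None allows all of them).

```python
def cross(k, a, b):
    """z-component of (a - k) x (b - k) for 2-D points."""
    return (a[0] - k[0]) * (b[1] - k[1]) - (a[1] - k[1]) * (b[0] - k[0])


def next_unvisited(parent, last, visited):
    """First city after `last` in `parent` (cyclically) not yet in the child."""
    i, n = parent.index(last), len(parent)
    for step in range(1, n + 1):
        city = parent[(i + step) % n]
        if city not in visited:
            return city
    return None


def knot_cracker(p1, p2, coords, k, start, max_shifts=None):
    """Build one offspring, shifting parents when only the alternate keeps the sense."""
    current, alternate = p1, p2
    first = current[(current.index(start) + 1) % len(current)]
    child, visited = [start, first], {start, first}
    sense = cross(k, coords[start], coords[first]) > 0
    shifts = 0
    while len(child) < len(p1):
        last = child[-1]
        cand = next_unvisited(current, last, visited)
        keeps = (cross(k, coords[last], coords[cand]) > 0) == sense
        if not keeps and (max_shifts is None or shifts < max_shifts):
            alt = next_unvisited(alternate, last, visited)
            if (cross(k, coords[last], coords[alt]) > 0) == sense:
                cand = alt  # the alternate parent resolves the suspicious link
                current, alternate = alternate, current
                shifts += 1
        child.append(cand)  # otherwise a possibly knotted link is inherited
        visited.add(cand)
    return child
```

Because every city is taken from the set of cities not yet placed, the result is always a valid permutation, which is why no repair step is needed.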



4 First Version: Implementation and Results 

All results reported in this paper were obtained with a steady-state-without-duplicates 
GA, which replaces the lowest ranked individual in the current population with the 
newest offspring. The initial population is randomly generated, parent selection is 
roulette-wheel-based and the fitness is linearly normalized. The first experiment to 
compare the performances of the EER and the knot-cracker was performed over the 
Oliver’s 30 cities problem, the same used by the authors of the EER in their seminal 
publication [5]. 
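
One step of such a steady-state scheme might look as follows in Python. The sketch is only indicative: the linear normalization constants (`best`, `step`) are assumptions, duplicates are detected by plain sequence equality (rotations of the same tour are not recognized), and all names are hypothetical.

```python
import random


def linear_normalized_fitness(costs, best=100.0, step=1.0):
    """Rank-based fitness: the shortest tour gets `best`, each rank loses `step`."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    fitness = [0.0] * len(costs)
    for rank, i in enumerate(order):
        fitness[i] = max(best - rank * step, step)
    return fitness


def steady_state_step(population, cost, crossover, rng=random):
    """Select two parents by roulette wheel and insert their offspring."""
    costs = [cost(tour) for tour in population]
    fitness = linear_normalized_fitness(costs)
    mother, father = rng.choices(population, weights=fitness, k=2)
    child = crossover(mother, father)
    if child not in population:  # steady state without duplicates
        worst = max(range(len(population)), key=lambda i: costs[i])
        population[worst] = child  # replace the lowest ranked individual
    return population
```

Since fitness depends only on rank, the selective pressure is fixed by the normalization constants rather than by the spread of tour lengths.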

In order to make a fair comparison with our GA, we attempted to optimize the 
operators' performance by tuning the population size. This parameter seems to be of 
paramount importance for both operators, given that the two other parameters are 
not limiting: the selective pressure is fixed by the fitness technique and the number 
of trials is large enough. To evaluate the Euclidean distance between two cities we 
follow the function proposed in [5]. The best known solution for this problem is 420 
units long. 

Results of 20 runs of 50000 trials appear in table 1. There we compare the 
population size required, the mean number of trials to reach the best solution, the 
average length of the best tour and the worst solution found. In this problem the EER 
and the knot-cracker found the best known solution 20 out of 20 times, but the latter 
does it with about 31% fewer trials on average (12154 versus 17744) and a population 
10 times smaller. In figure 4 the evolution of the mean of the best values obtained 
over 20 runs can be observed. Note the difference of the slopes of both operators at 
the early stages of the run, which shows the ability of the knot-cracker to quickly 
produce good solutions. 





                               EER     Knot-cracker 
Population size                1000    100 
Trials allowed                 50000   50000 
Mean of trials to best found   17744   12154 
Average                        420     420 
Worst solution                 420     420 

Table 1. Results on the Oliver's 30 cities problem over 20 runs. 

Fig. 4. Best results average over 20 runs of the Oliver's 30 cities problem. 



5 Enhancing the Operator: Results 

A modification that enhances the knot-cracker performance can be easily included. To
determine the turning sense we calculate the sign of the cross-product and, of course,
its magnitude. This magnitude represents the area of the parallelogram whose sides
are the position vectors from the reference point, as shown in figure 5. In most cases,
the smaller this area is, the closer the cities are. This idea was used in the
enhanced version of the knot-cracker to select the city to be included in the offspring
when no parent solves the change of turning sense. This modification adds a greedy
touch to the operator with no additional effort.
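
The quantities used by this heuristic can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: cities are 2-D coordinate pairs, and
the function names (cross_z, turning_sense, parallelogram_area, pick_smallest_area)
are hypothetical and do not come from the paper.

```python
def cross_z(ref, a, b):
    """z-component of (a - ref) x (b - ref) for 2-D points."""
    ax, ay = a[0] - ref[0], a[1] - ref[1]
    bx, by = b[0] - ref[0], b[1] - ref[1]
    return ax * by - ay * bx


def turning_sense(ref, a, b):
    """+1 for an anti-clockwise turn, -1 for clockwise, 0 if collinear."""
    c = cross_z(ref, a, b)
    return (c > 0) - (c < 0)


def parallelogram_area(ref, a, b):
    """Area of the parallelogram spanned by the two position vectors."""
    return abs(cross_z(ref, a, b))


def pick_smallest_area(ref, current, candidates):
    """Greedy choice (assumed form): the candidate spanning the smallest area."""
    return min(candidates, key=lambda c: parallelogram_area(ref, current, c))
```

Both the sign and the magnitude come from the same cross-product, which is why the
greedy choice adds no extra cost. A small area does not always mean a short distance
(a far but nearly collinear city also gives a small area), in line with the "in most
cases" qualification above.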




Fig. 5. The magnitude of the cross-product represents the area of the shaded parallelogram. 

Table 2 shows the performance of the enhanced knot-cracker on the same 30 cities
problem. The average number of recombinations needed to reach the best value is 33%
of those required by the EER and only 47% of those of the previous version of the
operator. These results were obtained with no loss of consistency and no increase in
population size. Figure 6 illustrates the mean of the best values over 20 runs for the
three operators. Here again the slopes show that the heuristic behind the knot-cracker
increases the convergence rate without stagnation, in spite of the smaller population
size it requires.





                               EER      Knot-cracker   E. knot-cracker
Population size                1000     100            100
Trials allowed                 50000    50000          50000
Mean of trials to best found   17744    12154          5707
Average best tour length       420      420            420
Worst solution                 420      420            420

Table 2. Results on the Oliver's 30 cities problem over 20 runs.




Fig. 6. Best results average over 20 runs of the Oliver's 30 cities problem. 

Other experiments were made with the enhanced knot-cracker and the EER. Table
3 shows the results for Eilon's 50 cities problem [5]. The mean number of trials needed
by the enhanced knot-cracker to find the best solution is 17% larger than that of the
EER, and the averages of those solutions are similar. The worst value found by the EER
is 2.58% above the best known solution, while that of the enhanced knot-cracker is only
1.41% above. Figure 7 illustrates the behavior of the best values over 20 runs. The
slopes show that the heuristic behind the enhanced knot-cracker increases the
convergence rate at the early stages of the runs without stagnation, in spite of the
smaller population size it requires. The best known solution for this problem is 426
units long.

The results for Eilon's 75 cities problem [5] are shown in Table 4. In this problem
the mean number of trials needed by the enhanced knot-cracker to find the best solution
is 11% smaller than that of the EER, and the averages are similar. The worst value
found by the EER is 3.17% above the best known solution, and that of the enhanced
knot-cracker is 2.80% above. The evolution of the best values over 20 runs is shown in
figure 8. Here again the slopes show the higher convergence rate of the enhanced
knot-cracker. The best known solution for this problem is 535 units long.
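
The percentages quoted above are relative gaps with respect to the best known tour
length. As a small worked check, the sketch below recomputes them from the worst
solutions reported in Tables 3 and 4; the function name percent_over_best is only
illustrative.

```python
def percent_over_best(value, best_known):
    """Relative excess of a tour length over the best known length, in percent."""
    return 100.0 * (value - best_known) / best_known


# Worst solutions reported for Eilon's 50 cities problem (best known: 426).
eer_50 = percent_over_best(437, 426)
knot_50 = percent_over_best(432, 426)

# Worst solution of the enhanced knot-cracker on Eilon's 75 cities (best known: 535).
knot_75 = percent_over_best(550, 535)

print(round(eer_50, 2), round(knot_50, 2), round(knot_75, 2))
```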

                               EER      E. knot-cracker
Population size                1000     300
Trials allowed                 100000   100000
Mean of trials to best found   49146    57579
Average best tour length       427.75   427.45
Worst solution                 437      432

Table 3. Results on Eilon's 50 cities problem over 20 runs.




Fig. 7. Best results average over 20 runs of the Eilon's 50 cities problem 




Fig. 8. Best results average over 20 runs of the Eilon's 75 cities problem. 

                               EER      E. knot-cracker
Population size                2000     500
Trials allowed                 200000   200000
Mean of trials to best found   162722   146236
Average best tour length       542.05   542.90
Worst solution                 552      550

Table 4. Results on Eilon's 75 cities problem over 20 runs.



6 Conclusions and Discussion 

In this paper we have proposed two versions of a new genetic operator for the
generation of good TSP solutions with evolution programs. The basic heuristic behind
the proposed operators is to try to produce less knotted offspring by preserving a
given turning sense. Since a turning sense can be associated with the sign of the
cross-product of the position vectors of consecutive cities, a change of sign of this
product can be a hint that a knot exists. These operators do not need to evaluate
the cost of individual "inter-cities links" as other heuristic operators do [1]. For
this reason we compared the performance of the operator against the EER, one of
the best sequential non-biased genetic operators [4]. The knot-cracker was designed
without trying to preserve any position, order or adjacency; nevertheless, the results
obtained suggest that it must preserve one of those characteristics (adjacency at
least). If this assumption is true, it is just a consequence of the way the operator
was built.

Experiments were performed on the same benchmarks used by the authors of the
EER in [5]. In Oliver's 30 cities problem, both versions of the knot-cracker
showed better performance than the EER, and the enhanced knot-cracker did best. In
Eilon's 50 and 75 cities problems the enhanced knot-cracker was compared with
the EER and in all cases the proposed operator had better performance. This
improvement rests on fewer iterations to reach good solutions, smaller population
sizes, and a lower dispersion of the best solutions found (i.e. our operator
stagnates later). It is clear that the EER is a more general sequential operator
whose applicability goes beyond the TSP. The knot-cracker has been designed with
the geometrical idea of the TSP in mind; that is why it requires the coordinates of the
cities. It seems that other analogies should be developed to apply the knot-cracker to
more general sequencing problems. This point is being studied further.
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Abstract. This article describes a new system for learning rules using 
rotated hyperboxes as individuals of a genetic algorithm (GA). Our
method attempts to find hyperboxes at any orientation by
combining deterministic hill-climbing with a GA. Standard techniques,
such as C4.5, use hyperboxes that are aligned with the coordinate axes.
The system uses the decision queue (DQ) as the method of representing the
rule set. This means that the obtained rules must be applied in a specific
order; that is, an example will be classified by the i-th rule only if it does
not satisfy the condition part of any of the i-1 previous rules. With this
policy, the number of rules is smaller because one rule can lie inside
another. We have tested our system on real data from the UCI repository.
Moreover, we have designed some two-dimensional artificial databases to
show the experiments graphically. The results are summarized in the last
section.



1 Introduction 

Supervised learning (SL) is used when the data samples have known outcomes that
the user wants to predict. This type of learning is the most common form because
data are usually collected with some outcome in mind. Human problem solving is
normally an exercise in studying input conditions to predict a result based upon
previous experience with similar situations. SL algorithms tend to emulate that sort of
human behavior.

Decision trees (DT) are a particularly useful tool in the context of machine learning 
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techniques because they perform classification by a sequence of simple, easy-to- 
understand tests whose semantics is intuitively clear to domain experts. Some
techniques, like C4.5, construct decision trees by selecting the best attribute with a
statistical test that determines how well it alone classifies the training examples [9].
This class of DTs may be called axis-parallel, because the tests at each node are
equivalent to axis-parallel hyperplanes in the space. Other techniques, such as
OC1 [7], build oblique decision trees (ODT), which test a linear combination of the
attributes at each node; these tests are therefore equivalent to hyperplanes at an
oblique orientation to the axes.

At this point, we must remember that, in a domain with N training examples, each
described using k real-valued attributes, there are at most $2^k \binom{N}{k}$ distinct
k-dimensional oblique splits; for axis-parallel splits, however, there are only
$N \times k$ distinct possibilities, so an algorithm can exhaustively search for the
best split at each node.
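
To give a feel for the difference in search-space size, the sketch below compares the
two counts. It assumes the upper bound 2^k * C(N, k) for oblique splits, as reported
in the oblique decision tree literature; the function names are illustrative only.

```python
from math import comb


def oblique_split_bound(n_examples, n_attributes):
    """Assumed upper bound on distinct oblique splits: 2^k * C(N, k)."""
    return 2 ** n_attributes * comb(n_examples, n_attributes)


def axis_parallel_splits(n_examples, n_attributes):
    """Number of distinct axis-parallel splits: N * k."""
    return n_examples * n_attributes


# Sizes of a small database with 150 examples and 4 attributes.
print(axis_parallel_splits(150, 4))
print(oblique_split_bound(150, 4))
```

Even for such a small database the oblique bound is several orders of magnitude
larger, which is why exhaustive search is only practical in the axis-parallel case.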

In any case, finding the smallest DT (axis-parallel or oblique) is an NP-hard problem
[2]. Both methods use hill-climbing, that is, the algorithm never backtracks; therefore,
it may converge to locally optimal solutions that are not globally optimal.

Simpson [12] introduced the idea of using hyperboxes to cluster or classify spatial 
data. Each hyperbox is viewed as a fuzzy cluster, a fuzzy set in which all of the 
elements within the hyperbox have membership 1.0 for being in that set, and elements 
outside the hyperbox can have a positive membership in the set depending on a fuzzy 
membership rule for that set. Simpson used a deterministic procedure to place and 
appropriately size hyperboxes to describe data. Hyperboxes were created and sized by 
considering the data in an ordered sequence. A hyperbox was placed around 
preliminary data. As subsequent data was added, either the present hyperbox was 
grown to include the new data, or a new hyperbox was added and the process 
continued. This procedure was of limited efficacy because it required trial-and-error
setting of operator parameters and the final solution depended on the order of
presentation of the data, even when the data possessed only spatial and not sequential
characteristics. Fogel and Simpson used evolutionary programming to optimize the
position of hyperboxes to cluster data in light of a minimum description length
(MDL) criterion. First, the experiments were restricted to evolving hyperboxes that
were aligned with the coordinate axes; afterwards, they included the capability to
rotate the hyperboxes. At this point, it is important to note that Fogel's method tries
to solve the clustering problem, that is, unsupervised learning.

Genetic algorithms (GA) employ a randomized search method to seek a maximally
fit hypothesis [3, 4]. This search is quite different from that of other learning
methods, such as those mentioned above. The GA search can move much more abruptly,
replacing a parent hypothesis by an offspring that is less likely to fall into the
kind of local minima that can trap the other methods.

In previous works, we presented a system to classify databases by using axis-parallel
hyperboxes. This system used a GA to search for the best solutions and produced a
hierarchical set of rules. The hierarchy means that an example will be classified by
the i-th rule only if it does not satisfy the conditions of the i-1 preceding rules. The
rules are obtained sequentially until the space is totally covered. The behavior is
similar to a queue; for that reason we call the produced rule set a decision queue (DQ).
This concept is based on k-DL, the set of decision lists with conjunctive clauses of
size at most k at each decision [11]. A decision list is a list L of pairs

$L = ((f_1, v_1), (f_2, v_2), \ldots, (f_r, v_r))$    (1)

where each $f_j$ is a term (a conjunctive clause of size at most k), each $v_j$ is a
value in {0,1}, and the last function $f_r$ is the constant function true.

A decision list L defines a boolean function as follows: for any assignment
$x \in X_n$, L(x) is defined to be equal to $v_j$, where j is the least index such that
$f_j(x) = 1$ (such an item always exists, since the last function is always true).

DQ is based on DL. In fact, DQ is a generalization of DL because it permits
encoding functions $f_j$ of continuous attributes, and the values $v_j$ can belong to
any set.

Furthermore, DQ does not have the last constant function true. However, we could
interpret that last function as an unknown function; that is, we do not know to which
class the example belongs. Therefore, it may be advisable to say "unknown class"
instead of taking an erroneous decision.
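
The way a decision queue is applied can be sketched as follows. This is a minimal
illustration, assuming each rule is stored as a (condition, label) pair where the
condition is any predicate over the example; the names classify and unknown are
hypothetical.

```python
from typing import Any, Callable, Sequence, Tuple

# A rule is a condition over an example plus the label it assigns.
Rule = Tuple[Callable[[Sequence[float]], bool], Any]


def classify(queue: Sequence[Rule], x: Sequence[float], unknown: Any = "unknown"):
    """Return the label of the first rule whose condition holds for x.

    Unlike a decision list, there is no final always-true rule, so an
    example matched by no rule is reported as unknown.
    """
    for condition, label in queue:
        if condition(x):
            return label
    return unknown


# Two nested one-dimensional rules: the inner one must come first.
queue = [
    (lambda x: 0 <= x[0] <= 1, "A"),
    (lambda x: 0 <= x[0] <= 5, "B"),
]
print(classify(queue, [0.5]), classify(queue, [3.0]), classify(queue, [9.0]))
```

Because the rules are tried in order, the second rule does not need to exclude the
region already claimed by the first one, which is what allows one rule to sit inside
another.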

In the sense mentioned above, our system has a measure, called unknowledge, that
indicates how many test examples have no associated class. As the number of rules
or the allowed error rate (relaxing coefficient) can be given by the domain expert,
some unnecessary mistakes can be avoided if the rule set does not assign a class to
the test example. Increasing the relaxing coefficient reduces the unknowledge, but the
number of misclassified examples becomes higher. The expert, based on
experimentation, must determine this parameter.
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Fig. 1. Rotated versus axis-parallel hyperboxes. 

In this paper, we propose to keep the basic structure of our earlier works [1,10],
but to change the shapes that model the search space. That is to say, the current work
extends these previous efforts by including the capability to rotate the hyperboxes.

Figure 1 shows an example in which rotated hyperboxes can find better
solutions than axis-parallel hyperboxes do. The decision queue policy is applied in
order to reduce the number of rules.
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With this DQ-method, there is no problem if the regions are overlapped. An extreme 
case is presented in the next figure. 




Fig. 2. Decision queues for overlapped regions. 

On the other hand, if we use axis-parallel techniques, the number of rules is very
high. When should one technique or the other be applied? In principle, it is not
possible to know, but a good solution could be to explore the search space with
axis-parallel hyperboxes and, progressively, to try to rotate the best solutions.




Fig. 3. Axis-parallel solution to the figure 2. 



The number of rules in figure 3 is very high. The numbers represent parts of the
regions found by using rotated hyperboxes, as figure 2 shows.



2 Description 



2.1 Environment 

In order to apply GAs to a learning problem, we need to select an internal 
representation of the space to be searched and define an external function that assigns 
fitness to candidate solutions. Both components are critical to the successful 
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application of the GAs to the problem of interest. Information of the environment 
comes from a data file, where each example has a class and a number of attributes.
The GA uses real coding; that is, an individual is an n-tuple of real values. If
k is the dimension, an individual has exactly 3k values: 2k for the boundaries
of each dimension; k-1 for the angles of rotation, in radians, anti-clockwise around
the hyperbox centre; and 1 for the class. The next figure shows such an n-tuple:
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Fig. 4. Representation of an individual. 

where $l_i$ and $u_i$ represent the lower and upper bounds of the individual,
respectively, for every dimension; $\theta_i$ is a rotation angle; and class is the
class assigned by the rule. In two dimensions a hyperbox can be placed at any
orientation by using only one rotation; in k dimensions k-1 rotations are necessary.

We consider that an example belongs to the area determined by an individual if it
satisfies its condition part. Thus, let an example be given by
$e_j = (p_1, p_2, \ldots, p_k, c)$; then it lies in the region defined by the individual
(or, equivalently, the example is covered by the rule)
$ind_i = (l_1, u_1, l_2, u_2, \ldots, l_k, u_k, \theta_1, \theta_2, \ldots, \theta_{k-1}, class)$
if, on rotating the point $P = (p_1, p_2, \ldots, p_k)$ by the angles
$-\theta_1, -\theta_2, \ldots, -\theta_{k-1}$ about the centre of the hyperbox defined by
$(l_1, u_1, l_2, u_2, \ldots, l_k, u_k)$, the result P' belongs to this hyperbox.




Fig. 5. Rotating R by the angle $\theta$ is equivalent to rotating P by the angle $-\theta$ around the centre of R.

For example, in three dimensions the example $(p_1, p_2, p_3, c)$ will be covered by the
rule $(l_1, u_1, l_2, u_2, l_3, u_3, \theta_1, \theta_2, class)$ if P' satisfies

$l_1 \le p'_1 \le u_1 \;\wedge\; l_2 \le p'_2 \le u_2 \;\wedge\; l_3 \le p'_3 \le u_3$    (2)



where P' is obtained as follows. Let
$(m_1, m_2, m_3) = ((l_1 + u_1)/2, (l_2 + u_2)/2, (l_3 + u_3)/2)$ be the centre of the
hyperbox defined by the rule; then the coordinates of P' are:

$(p'_1, p'_2, p'_3) = (p_1 - m_1,\; p_2 - m_2,\; p_3 - m_3) \ldots$    (3)

and the example will be correctly classified if its class is equal to c.
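
The coverage test is easiest to see in two dimensions, where a single angle is enough.
The sketch below is an illustration under stated assumptions, not the paper's
implementation: the rule is the tuple (l1, u1, l2, u2, theta, class), the point is
rotated by -theta about the box centre, and the bounds are treated as inclusive. The
name covers_2d is hypothetical.

```python
import math


def covers_2d(rule, point):
    """Check whether a rotated 2-D hyperbox covers a point.

    rule  = (l1, u1, l2, u2, theta, cls), theta in radians, anti-clockwise
    point = (p1, p2)
    """
    l1, u1, l2, u2, theta, _cls = rule
    m1, m2 = (l1 + u1) / 2, (l2 + u2) / 2        # centre of the hyperbox
    dx, dy = point[0] - m1, point[1] - m2        # point relative to the centre
    c, s = math.cos(-theta), math.sin(-theta)    # rotate the point by -theta
    px = m1 + c * dx - s * dy
    py = m2 + s * dx + c * dy
    return l1 <= px <= u1 and l2 <= py <= u2


# A long thin box rotated by 45 degrees covers points along the diagonal.
rule = (-2.0, 2.0, -0.5, 0.5, math.pi / 4, "A")
print(covers_2d(rule, (1.0, 1.0)), covers_2d(rule, (1.0, -1.0)))
```

Rotating the example instead of the box keeps the membership test itself
axis-parallel, so the same bound comparison serves for rotated and non-rotated rules.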



2.2 Algorithm 

The algorithm is a typical sequential covering GA [6]. It chooses the best individual
of the evolutionary process and transforms it into a rule, which is used to eliminate
data from the training file [13]. In this way, the training file is reduced for the
following iteration. The termination criterion is reached when there are no more
examples to cover.

The method of generating the initial population consists of randomly selecting, for
every individual of the population, an example from the training file. Then an
interval containing the example is obtained by adding and subtracting a random
quantity from the values of the example. Moreover, the angles are randomly
generated between zero and $\pi/2$. Sometimes, examples very near to the boundaries
are hard to cover during the evolutionary process. To solve this, the search space is
enlarged (actually, the lower bound is decreased by 5% and the upper bound is
increased by 5%). For example, in one dimension, let a and b be the lower and upper
bounds of the attribute; then the range of the attribute is b-a. We randomly choose an
example $(x_i, class)$ from the training file; a possible individual of the population
could then be:

$(x_i - range \cdot k_1,\; x_i + range \cdot k_2,\; class)$    (4)

where $k_1$ and $k_2$ are random values belonging to [0,1], and class is the same as
that of the example.
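
For the one-dimensional case, the construction of an initial individual can be
sketched as below. The function name initial_individual_1d and the rng parameter are
illustrative assumptions; the 5% enlargement of the search space is left out because
the text does not say what the percentage is taken of.

```python
import random


def initial_individual_1d(example, a, b, rng=random):
    """Build (lower, upper, class) around a training example.

    example = (x, cls); a and b are the attribute's lower and upper bounds.
    """
    x, cls = example
    value_range = b - a
    k1, k2 = rng.random(), rng.random()   # random values in [0, 1)
    return (x - value_range * k1, x + value_range * k2, cls)


rng = random.Random(0)                    # seeded only to make the sketch repeatable
individual = initial_individual_1d((3.0, "A"), 0.0, 10.0, rng)
print(individual)
```

By construction the seed example always lies inside the interval, so every initial
individual covers at least one example of its own class.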

The evolution module includes elitism: the best individual of every generation is
replicated to the next one. A set of children is obtained from copies of the parents,
selected randomly but depending on their fitness values. The remainder is formed
through crossovers. Afterwards, mutation is applied with a given probability.
The crossovers are specifically designed: a value is chosen within one of the three
segments formed inside the interval of the attribute by taking two values (a and b in
the next figure) as cross points. That is, for every attribute there are two values,
and these values partition the interval into three segments. We randomly select a
value inside a segment that is itself randomly chosen. The next figure shows the
procedure:
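
[Figure: the attribute interval [min, max] is split at min(a,b) and max(a,b) into three
segments; the offspring value is drawn (1) between min(a,b) and max(a,b), (2) between
min and min(a,b), or (3) between max(a,b) and max.]

Fig. 6. Crossover operator.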

One of the three types of crossover operator could be applied, depending on a
probability. The first one is more conservative, and the second and third ones are more
explorative. When the crossover is applied to any angle location, the first one is
always used.
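
A possible reading of the three crossover types is sketched below. It assumes that a
and b are the two values being crossed for one attribute, that type 1 draws from the
inner segment and types 2 and 3 from the two outer segments, and that the draw within
a segment is uniform. The function name crossover_value is hypothetical.

```python
import random


def crossover_value(a, b, lo, hi, kind, rng=random):
    """Draw an offspring value from one of three segments of [lo, hi].

    kind 1: between min(a, b) and max(a, b)   (conservative)
    kind 2: between lo and min(a, b)          (explorative)
    kind 3: between max(a, b) and hi          (explorative)
    """
    left, right = min(a, b), max(a, b)
    segments = {1: (left, right), 2: (lo, left), 3: (right, hi)}
    start, end = segments[kind]
    return rng.uniform(start, end)


rng = random.Random(1)
print([round(crossover_value(2.0, 5.0, 0.0, 10.0, k, rng), 3) for k in (1, 2, 3)])
```

Type 1 keeps the offspring between its two parent values, which matches its
description as the conservative option and explains why it is the one used for angles.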

Mutation is applied in two different ways: if the location corresponds to a bound of
the interval, then a quantity is subtracted or added, depending on whether it is the
lower or the upper bound, respectively (the quantity is the smallest Euclidean
distance between any two examples); if the location corresponds to an angle, a new
angle is randomly generated.

Furthermore, it is advisable to explore the space around the best individual, because
we cannot know in advance which angles are best, nor whether one angle is going to be
better than another. To this end we use a hill-climbing technique, which can find
better solutions despite the drawbacks mentioned in the introduction. The method
consists of exploring nearby rotated regions with the same centre. The exploration
angles belong to the interval $[-\pi/6, \pi/6]$, with an increment of $\pi/30$. Thus,
every best rule explores the search space with ten other regions around the same
centre for each attribute. However, in order to reach a better fitness value, the
interval is also modified by the same quantity that mutation uses.
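
The ten exploration angles follow directly from the interval and the increment: the
interval holds eleven multiples of the step, and the zero offset is the current
orientation. The sketch below enumerates them; the function name and its default
arguments are illustrative.

```python
import math


def exploration_angles(step=math.pi / 30, limit=math.pi / 6):
    """Angle offsets in [-limit, limit] in multiples of step, excluding zero."""
    n = round(limit / step)               # number of steps on each side
    return [k * step for k in range(-n, n + 1) if k != 0]


angles = exploration_angles()
print(len(angles))
```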

Improving the best individual is a difficult task. If the fitness value is better, then
the new angle replaces the old one, and one value of one attribute is modified as
mentioned above. This method allows a hyperbox to be rotated using only one
dimension. Exploring 10 new orientations at the beginning (in the first generations) can
produce the typical problems of hill-climbing methods. For that reason, we recommend
using few explorations at the start and increasing their number toward the end. Thus,
the last generations explore more than the first ones.

We next show an overview of the DQ-Classifier. 



While examples remain in the training file
    Step 1. Initialize population
    Step 2. Repeat num_generations times
        Step 2.1. Evaluation
        Step 2.2. Select the best
        Step 2.3. Replication
        Step 2.4. Crossover and Mutation
        Step 2.6. Improve the best
    Step 3. Put the best one in the Decision Queue
    Step 4. Eliminate the examples covered by the best one
Fig. 7. Overview of DQ-Classifier. 
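
The outer loop of this overview can be sketched independently of the GA itself. In the
sketch below, find_best_rule stands in for Steps 1 and 2 (the whole evolutionary
search) and covers for the rule's condition part; both are hypothetical callbacks, and
the early exit when a rule covers nothing is an added safeguard, not part of the
overview.

```python
def sequential_covering(examples, find_best_rule, covers):
    """Build a decision queue by repeatedly learning one rule and
    removing the examples it covers from the training data."""
    remaining = list(examples)
    queue = []
    while remaining:
        rule = find_best_rule(remaining)            # Steps 1-2: evolve a rule
        if not any(covers(rule, e) for e in remaining):
            break                                   # safeguard against looping forever
        queue.append(rule)                          # Step 3: put it in the queue
        remaining = [e for e in remaining if not covers(rule, e)]  # Step 4
    return queue


# Toy stand-ins: a rule is (low, high, cls) spanning all examples of one class.
def toy_find_best_rule(remaining):
    cls = remaining[0][1]
    xs = [x for x, c in remaining if c == cls]
    return (min(xs), max(xs), cls)


def toy_covers(rule, example):
    return rule[0] <= example[0] <= rule[1]


data = [(1, "A"), (2, "A"), (8, "B")]
print(sequential_covering(data, toy_find_best_rule, toy_covers))
```

The order in which rules are appended is the order in which they must later be
applied, which is what makes the result a queue rather than an unordered rule set.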

A possible criterion for implementing the mutation operator consists of distinguishing
between mutation of values and mutation of angles, as two independent operators.
Mutation of values could have a higher probability of application than mutation of
angles. In addition, the evaluation function could incorporate the distance from the
individual (rule) to the closest example of the same class. Thus, we can penalize
rotated rules with wrong angles, that is, those for which the new individual does not
move closer to the closest example of the same class.



2.3 Fitness Function 

The evolutionary algorithm minimizes the fitness function f for each individual. It is
given by

if G(i) *RC<=CE(i) then CE(i) =0 



/(/) 



^ , G{i) 

T 1 + CE{i) 



where T is the cardinality of the training file; U is a new factor called coverage (the
coverage of a rule is the side of a k-dimensional hypercube whose volume equals the
volume of the k-dimensional region covered by the rule); CE(i) is the number of class
errors, which occur when an example belongs to the region defined by the rule but
does not have the same class; G(i) is the number of goals of the rule; and RC is the
relaxing coefficient. Thanks to U in the fitness function, every rule can quickly
expand to find more examples.



2.4 Relaxing Coefficient 

Databases used as training files do not have clearly differentiated areas; for that
reason, obtaining a totally coherent rule system involves a high number of rules. In a
previous paper [1] we showed a system capable of producing a rule set free of errors;
however, it is sometimes interesting to reduce the number of rules in order to have a
rule set that may be used as a comprehensible linguistic model. When databases
present a distribution of examples that is very hard to classify, it is advisable to use
a relaxing coefficient [10]. Many times, we are more interested in understanding the
structure of the databases than in the error rate. In such cases, a system with fewer
rules (despite some errors) could be better than one with too many rules (with a 0%
error rate). It may therefore be interesting to introduce the relaxing coefficient to
understand the behavior of databases by decreasing the number of rules. RC indicates
what percentage of the examples inside a rule can have a class different from that of
the rule. RC behaves like an upper bound on the error with respect to the training
file, that is, as an allowed error rate.

To deal efficiently with noise and find a good value for RC, the expert should have
an estimate of the noise percentage in the data.



3 Application 



3.1 Ex Profeso Databases 

We have designed some databases of varying complexity to show the experiments
graphically. These databases are shown in Fig. 8. Results are in Table 1.




Fig. 8. Ex profeso databases named DB1, DB2 and DB3.



3.2 Databases from UCI Repository 

The databases used in the experiments described in this section are from the UCI
Repository [7]. We use five cross-validation runs in all our experiments to estimate
classification accuracy. Each experiment consists of the following steps: randomly
divide the data into two disjoint partitions (70% and 30%); build a rule set using the
70% partition and test it on the 30% partition; for each partition, count the number of
correct classifications of the rule set and divide it by the number of instances in the
test file to compute the classification accuracy; sum the five accuracy values and
divide by five; report the error rate, that is, one hundred minus this mean accuracy
multiplied by 100, together with the average number of rules. We have chosen C4.5 for
comparison, also with five experiments and cross validation; the results are shown in
Table 2.
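
The evaluation protocol can be sketched as a small routine. It is only an
illustration: train and predict are hypothetical callbacks standing in for any
classifier, the data are (features, label) pairs, and the seed is an assumption added
to make the sketch repeatable.

```python
import random


def repeated_holdout(data, train, predict, repeats=5, train_fraction=0.7, seed=0):
    """Mean error rate (in percent) over several random 70/30 splits."""
    rng = random.Random(seed)
    error_rates = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(round(train_fraction * len(shuffled)))
        train_set, test_set = shuffled[:cut], shuffled[cut:]
        model = train(train_set)
        correct = sum(1 for x, y in test_set if predict(model, x) == y)
        accuracy = correct / len(test_set)
        error_rates.append(100.0 * (1.0 - accuracy))
    return sum(error_rates) / repeats


# Toy classifier that always predicts the label it was trained with.
data = [(i, "A") for i in range(10)]
print(repeated_holdout(data, lambda s: "A", lambda model, x: model))
```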



DATABASE          C4.5                                DQ-CLASSIFIER-RH
                  ERROR RATE (%)   NUMBER OF RULES    ERROR RATE (%)   NUMBER OF RULES
DB1 (200, 2, 2)   12.9             17                 3.3              3
DB2 (200, 2, 2)   16.25            12.5               2.7              2
DB3 (200, 2, 2)   12.13            11                 6.6              3

Table 1. The results of artificial databases.



DATABASE                 C4.5                                DQ-CLASSIFIER-RH
                         ERROR RATE (%)   NUMBER OF RULES    ERROR RATE (%)   NUMBER OF RULES
IRIS (150, 4, 3)         6.3              4.4                4.83             3.6
BREAST CANCER (683, 9,   13.8             5.2                5.38             2.2
PIMA (768, 8, 2)         28.4             77.6               26.35            17.4
WINE (178, 13, 3)        7.2              5.0                10.1             6.4

Table 2. Databases (number of examples, dimension, number of classes).



It is important to note that every execution was carried out with a population
of 50 individuals and 50 generations. These are very low numbers considering the
number of examples and the number of dimensions of the databases.

The decision queue plays an important role in keeping the number of rules low.
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4 Conclusions 

A supervised learning tool to classify databases with rotated hyperboxes is
presented in this paper. It produces a decision queue in which the conditions of each
rule indicate whether an example belongs to a rotated hyperbox. The number of rules is
reduced with regard to other systems, like C4.5, and varying the relaxing coefficient
gives more flexibility in constructing a classifier.

Abstract. Spatial layout design problems are related to grouping objects
with similar properties, assigning objects to pre-defined groups,
positioning objects in a constrained space, as well as finding a path
connecting certain objects in a three-dimensional world. Representation,
reasoning and documentation for spatial layout design problems are
expensive, inconsistent, incomplete and imprecise. This paper proposes a
framework called SpADD (Spatial Active Design Documentation) to
assist designers in spatial layout design tasks, starting from a set of
objects and a set of spatial constraints relating these objects. SpADD is
based on an extension of the ADD (Active Design Documentation) approach,
a canonical parametric network model, an engineering decision-making
model and an object-oriented class model applied to spatial layout design
tasks. Initial results of using an implemented version of SpADD for
preliminary design of oil pipeline layout in deep water oil fields are
discussed.



1 Introduction 

Many problems in engineering design involve grouping objects with similar
properties, assigning objects to pre-defined clusters, positioning objects in a
constrained space, as well as finding a path connecting certain objects, respecting a
set of constraints and following a set of criteria. These tasks are known as SLDP
(spatial layout design problems). Generally, the world surrounding the objects involved
in SLDP is complex, dynamic and overloaded with information. As the size of the
universe grows, the number of relevant parameters to be considered increases
drastically, impoverishing the designers' perception of the problem. In addition,
thresholds and horizon effects are seldom noticed by designers.

Even for ordinary designs, the complexity of representation, reasoning and
documentation for SLDP in a constantly changing world overloaded with spatial
information is large. In addition, the treatment of spatial issues such as cyclic
dependencies and canonical parametric networks is very difficult due to the
overwhelming amount of information.

In this work we propose a framework called SpADD (Spatial Active Design
Documentation) to enhance GIS (Geographic Information Systems) with intelligent
mechanisms to deal with spatial layout design tasks in a domain where the world
surrounding the objects is complex, dynamic and overloaded with information.

The main objectives of this research are:

• study the issues related to SLDP;

• create a framework to support spatial layout design reasoning which extends the
ADD (Active Design Documentation) approach [5]; and

• implement a SpADD prototype to show its feasibility to deal with SLDP.

Two main approaches have been used to treat problems involving spatial reasoning:
quantitative and qualitative. The quantitative approach [2] emphasizes a precise
description of the geometry of the objects. The qualitative approach [9] [6] describes
the relations among objects by a finite set of symbols. For example, the set of symbols
{North, South, East, West} denotes a system of qualitative directions. The semantics
of these symbols can vary, depending on the context and the scale. The need for
precision in layout design calculation led us to follow the quantitative approach in
modeling the domain. However, since any active document must generate explanations of
decisions for users, qualitative relations, such as further, greater, and less
expensive, are employed to provide them.

This paper gives an overview of our research, starting by presenting in Section 2 the
issues related to SLDP and illustrating the problem with an example. Since SpADD is
an extension of ADD, Section 3 presents an overview of the main aspects of the ADD
approach. Section 4 presents the SpADD framework and Section 5 demonstrates
ADDSUB, a working SpADD prototype for oil pipeline design. The remaining
sections are concerned with related work and conclusions.



2 Spatial Layout Design 

Design problems can be described as a complete specification of a set of components
and their relations so as to satisfy a set of constraints [3]. In SLDP, the design
specification consists of the list of selected objects, their clustering, their
specific position and the path connecting them. Consequently, the SLDP design task
consists of:

• object clustering [8] [7] [1]: the task of grouping similar objects respecting a set
of constraints and following a set of criteria;

• object assigning: the task of assigning an object to a cluster;

• object positioning: the task of locating objects in a constrained space; and

• route-finding [10]: the task of finding the shortest path among located objects in a
constrained world.

In SLDP, the decision order affects the design context. For example, if object A is
positioned before object B, the location of object A is considered an obstacle
(constraint) to the position of object B. On the other hand, if object B is positioned
before object A, the location of object B is considered an obstacle (constraint) to the
position of object A. This holds for every type of decision in SLDP.
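
The effect of decision order can be illustrated with a deliberately simplified toy
model, which is not part of SpADD: objects are placed one at a time on a line of
discrete slots, each taking the free slot nearest to its preferred position, so
earlier placements act as obstacles for later ones. All names below are hypothetical.

```python
def place_in_order(targets, order, slots):
    """Greedy sequential placement on a set of free slots.

    targets maps each object name to its preferred position; each object
    takes the nearest free slot (ties go to the smaller slot).
    """
    free = set(slots)
    placed = {}
    for name in order:
        best = min(free, key=lambda s: (abs(s - targets[name]), s))
        placed[name] = best
        free.remove(best)        # the occupied slot becomes an obstacle
    return placed


targets = {"A": 2, "B": 2}       # both objects prefer the same position
print(place_in_order(targets, ["A", "B"], [1, 2, 3]))
print(place_in_order(targets, ["B", "A"], [1, 2, 3]))
```

Whichever object is decided first keeps its preferred position, and the other has to
move, so the two orders lead to different layouts from the same inputs.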

The oil pipeline layout in deep water oil fields illustrates spatial layout design.
Oil field exploration in deep water needs a special process. The oil is pumped and
drained to offshore platforms, where it is treated before being exported to land. Given
a set of wells with their target regions (objectives) and a number of oil exploration
units (platforms), the oil pipeline layout task consists of:

• Find the best grouping of wells, considering the distance between the objectives’ 
geometric centers (GC); 

• Assign each platform to a well-suited cluster, considering the maximum platform
capacity and the maximum number of risers the platform has to receive the oil;

• Locate the well heads in each objective, in order to minimize the distance to the
platform with which the well is associated;

• Locate each platform in a free area as close as possible to its cluster’s GC; 

• Define the pipeline routing that drains the oil from each well to the assigned 
platform, considering the existing obstacles. 

Partial grouping, assignment, and well and platform positions may also be given as
input data. Consequently, they may impose constraints on the process. The decision
ordering also influences the process. Fig. 1 illustrates a submarine arrangement
resulting from the process described above for two platforms and six oil wells.




[Figure 1 legend: platform (V), objective, pipeline route (line), oil well (X).]

Fig. 1. Representation of an oil pipeline layout design.

The complexity of SLDP in a constantly changing world overloaded with spatial
information motivates our efforts to create a framework (SpADD) to assist designers
with such problems. Since SpADD is an extension of ADD, Section 3 presents an
overview of the main aspects of the ADD approach.



3 The ADD Approach

The ADD (Active Design Documentation) approach uses the apprentice metaphor.
Let us assume that a company hires an apprentice to observe and record a designer
developing a project. The apprentice has basic engineering knowledge. He observes
the designer and creates a project model with enough information to explain the
project even without the designer. As long as the designer's decisions match the
apprentice's expectations, no interruption is needed.

Whenever the apprentice's expectation fails, he interrupts the designer to learn or
to advise the designer of a constraint violation. Since this apprentice has limited
knowledge, he can be replaced by a knowledge base and a knowledge acquisition
procedure. ADD is this assistant agent, which integrates design and documentation.

ADD uses a parametric network to represent domain knowledge. Nodes are design
parameters and arcs are dependencies among them. Parameters represent the context,
and dependencies identify each parameter's sub-parameters, i.e., those parameters
whose values need to be known before the parameter's value can be calculated. A
parameter can be calculated only if all its sub-parameters have already been
calculated.
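The evaluation rule above can be illustrated with a minimal sketch. It is not ADD's implementation: the class `Parameter`, the function `evaluate` and the example parameters (`well_x`, `platform_x`, `unit_cost`, `length`, `cost`) are names we assume for illustration, and the network is assumed to be acyclic.

```python
class Parameter:
    """Node of a toy parametric network (illustrative, not ADD's actual classes)."""

    def __init__(self, name, subs=(), formula=None, value=None):
        self.name = name
        self.subs = list(subs)   # sub-parameters this parameter depends on
        self.formula = formula   # computes the value from the sub-parameter values
        self.value = value       # preset for parameters that need no calculation


def evaluate(param, trace):
    """Depth-first evaluation: every sub-parameter is calculated before its parent."""
    if param.value is None:
        args = [evaluate(sub, trace) for sub in param.subs]
        param.value = param.formula(*args)
        trace.append(param.name)   # record the order in which values were calculated
    return param.value


# Hypothetical example: cost depends on length, which depends on two positions.
well_x = Parameter("well_x", value=0.0)
platform_x = Parameter("platform_x", value=12.0)
unit_cost = Parameter("unit_cost", value=3.0)
length = Parameter("length", [well_x, platform_x], lambda a, b: abs(b - a))
cost = Parameter("cost", [length, unit_cost], lambda l, c: l * c)

trace = []
total = evaluate(cost, trace)
```

Requesting `cost` forces `length` to be calculated first, so `trace` lists `length` before `cost`.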

The parameters can be of three types: primitive, derived or decided. A primitive
parameter has an independent value associated with a specific project case. A
derived parameter is a variable derived through a deterministic formula (mathematical
or heuristic). A decided parameter is a variable for which alternatives are generated,
constraints are applied to eliminate alternatives and criteria are applied to order
them. Derived and decided parameters hold references to the parameters they depend on
to be calculated, their sub-parameters.
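A decided parameter can be sketched as a three-step function: generate, filter, order. The code below is only an illustration under our own assumptions; the function `decide`, the candidate positions and the forbidden cell are hypothetical and do not come from ADD.

```python
import math


def decide(alternatives, constraints, criterion):
    """Drop alternatives that violate any constraint, then order the rest (best first)."""
    feasible = [a for a in alternatives if all(ok(a) for ok in constraints)]
    return sorted(feasible, key=criterion)


# Hypothetical decision: where to place a platform near a geometric center (GC).
gc = (3, 3)
forbidden = {(3, 3)}                       # e.g. a cell occupied by an obstacle
candidates = [(0, 0), (2, 2), (5, 5), (3, 3)]

ranked = decide(
    candidates,
    constraints=[lambda p: p not in forbidden],
    criterion=lambda p: math.dist(p, gc),  # smaller distance to the GC is better
)
```

The rejected alternative and the ordering of the remaining ones are exactly the information an explanation of the decision would need to report.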

4 The SpADD Framework 

SpADD is an ADD extension including a canonical representation model and an
object-oriented class model to support the spatial reasoning needed in spatial layout
design problems. Section 4.1 presents SpADD's knowledge base, emphasizing its
characteristics to deal with the issues identified in layout design, and Section 4.2
presents SpADD's reasoning process, emphasizing the features added to the
original ADD model.

4.1 SpADD’s Knowledge Base 

The SpADD knowledge base involves three issues:

• An extension of the ADD knowledge base to treat spatial reasoning problems;

• A canonical knowledge model to treat spatial layout design problems;

• An object-oriented class model applied to SpADD's knowledge base and
reasoning.

SpADD's ADD Knowledge Base Extension. The ADD knowledge base has information
about the project domain and about ADD's decision process. The original ADD
knowledge base is linear, discrete and static. It has limitations that hinder its use
with multiple methods for alternative design generation, cyclic dependencies,
dynamic dependencies, and canonical parameters and dependencies.

Multiple methods for alternative design generation. Some layout design decisions,
such as object clustering, have more than one alternative generation method for the
same parameter. For example, different clustering methods generally produce different
solutions from the same input data [1]. It is therefore useful to run more than one
program and to analyze and compare the resulting classifications. ADD uses only one
method for alternative design generation for each parameter. SpADD allows the use of
multiple methods for alternative design generation for each parameter. Thus it can
also be used as a tool for evaluating different methods of alternative design
generation.
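As a rough illustration of running several generation methods on the same parameter and comparing their results, consider the sketch below. The two toy generators (`split-x`, `split-y`), the `spread` metric and all function names are our assumptions, not methods used by SpADD; the input is assumed to contain at least two points.

```python
import math


def split_by_axis(points, axis):
    """Toy generator: two clusters split at the median coordinate on one axis."""
    ordered = sorted(points, key=lambda p: p[axis])
    half = len(ordered) // 2
    return [ordered[:half], ordered[half:]]


def spread(clusters):
    """Toy metric: total distance of the points to their cluster's geometric center."""
    total = 0.0
    for cluster in clusters:
        gc = (sum(x for x, _ in cluster) / len(cluster),
              sum(y for _, y in cluster) / len(cluster))
        total += sum(math.dist(p, gc) for p in cluster)
    return total


GENERATORS = {
    "split-x": lambda pts: split_by_axis(pts, 0),
    "split-y": lambda pts: split_by_axis(pts, 1),
}


def compare_methods(points, generators=GENERATORS, metric=spread):
    """Run every method on the same input and rank the method names by the metric."""
    results = {name: method(points) for name, method in generators.items()}
    ranking = sorted(results, key=lambda name: metric(results[name]))
    return results, ranking


wells = [(0, 0), (0, 1), (10, 0), (10, 1)]
results, ranking = compare_methods(wells)
```

Keeping the generators in a dictionary means a further method can be registered for the same parameter without touching the comparison code.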

Cyclic dependencies. ADD supports only acyclic dependencies. An acyclic dependency
appears when one parameter influences another in one direction only. However, the
treatment of spatial information requires cyclic dependencies. A cyclic dependency
appears when one parameter influences another and vice versa. In the oil pipeline
layout domain, a well is positioned at the location closest to its assigned platform.
On the other hand, a platform is located at the GC of the wells' locations, forming
a net of cyclic dependencies.

Dynamic dependencies. In layout design, dependencies may become known only during
the decision process. For example, to be positioned, a well depends on the position of
the platform to which it will be assigned. However, its platform is known only after
platforms have been assigned to groups, at execution time. Therefore, a parameter's
dependencies can be classified as static or dynamic. A parameter's static dependencies
are known beforehand from the canonical parametric model and are always present
during the parameter's lifetime. A parameter's dynamic dependencies become known
during the decision process and are set by one of the parameter's static derived
sub-parameters.

Canonical parameters and dependencies. In layout design, canonical parts of the
parameter network can be reproduced dynamically at execution time, i.e., dependencies
may become known during the decision process, and parameters or dependencies
may be created during data input. For example, even though the knowledge describing
an object is available, each problem has a different number of objects: parameters
and dependencies will be created during data input. SpADD uses a canonical
knowledge network to represent which parameters and dependencies exist beforehand
or will be created during data input or the reasoning process. It is presented next.

SpADD's Knowledge Network Model. SpADD's canonical parametric network model
(Fig. 2) is a quantitative model applied to SLDP that allows decisions to be
explained. In Fig. 2 the parameters represent the context (the space, the existing
objects, the forbidden regions) and the spatial layout decisions (object clustering,
assigning, positioning and path finding).

The SpADD canonical network considers two types of world objects, differentiated by
whether they can be assigned to a cluster: the simple objects (those that cannot be
assigned to clusters) and the composite objects (those that can be assigned to
clusters).

Fig. 2. SpADD knowledge network model.



The derived parameter Obji represents an object of the spatial problem. It has as
sub-parameters its relevant properties, such as shape, type, area and objective. The
primitive parameter Shape Obji represents the shape of Obji. The primitive parameter
Type Obji represents the type of Obji. The primitive parameter Objective Obji
represents the area in which Obji can be positioned.

The derived parameter Simple Objects concentrates as sub-parameters all type- 
simple objects. The derived parameter Composite Objects concentrates as sub- 
parameters all type-composite objects. The derived parameter Obstacle Objects con- 
centrates as sub-parameters all type-obstacle objects. 

The derived parameter Spatial Problem gathers as sub-parameters all decisions of a
spatial problem, such as objects' clustering, assigning, positioning and path finding.
The derived parameter Objects' Positioning gathers as sub-parameters all objects'
positioning decisions. The derived parameter Objects' Clustering has as sub-parameter
the clustering decision. The derived parameter Objects' Assigning has as sub-parameter
the assigning decision. The derived parameter Path-Finding gathers as sub-parameters
all decisions of finding a path between an object and the composite object assigned
to the cluster to which the object belongs.

The decided parameter Position Obji represents the position of a given object. It
depends on the object itself, the other objects' positions (attracting objects), the
path between the object and its destination and the existing obstacles in the world.
The cyclic dynamic dependence between the parameter Position Obji and its attracting
objects is resolved dynamically while the position is being calculated. These
dependencies are represented by a hatched line. The obstacles at positioning decision
time are obtained from the parameter Obstacles Pos Obji. The occupied space is
obtained dynamically by consulting the objects already positioned.

The decided parameter Clustering represents a set of objects sharing similar
properties. The clustering process depends upon the obstacles and the simple and
composite objects. In the clustering decision several clustering methods can be
applied. These methods can be initially classified into two kinds, namely partitioning
and hierarchical methods [7]. A partitioning method constructs a single partition with
a defined number of clusters. Hierarchical algorithms deal with all numbers of
clusters, from 1 to the number of objects, in the same run. The objects to be clustered
are always the type-simple objects. If the clustering type is partitioning, the number
of clusters is defined by the number of type-composite objects.
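The hierarchical case can be sketched as follows. This is a simplified centroid-linkage agglomeration written for illustration; it is not the agglomerative nesting procedure of [7], and the function names and sample points are our assumptions. One run records a partition for every number of clusters, from the number of objects down to 1.

```python
import math
from itertools import combinations


def centroid(points):
    """Geometric center of a non-empty list of 2-D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def agglomerate(points):
    """Return {k: partition into k clusters} for every k from len(points) down to 1."""
    clusters = [[p] for p in points]
    levels = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        # Merge the two clusters whose centroids are closest (i < j always holds).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: math.dist(centroid(clusters[ij[0]]),
                                     centroid(clusters[ij[1]])),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        levels[len(clusters)] = [list(c) for c in clusters]
    return levels


# Hypothetical simple objects (e.g. objective centers) and two composite objects.
simple_objects = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
levels = agglomerate(simple_objects)
two_clusters = levels[2]   # the level matching the number of composite objects
```

Selecting `levels[k]` with k equal to the number of type-composite objects gives the same kind of result a partitioning method would be asked for directly.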

The decided parameter Assigning represents the selection of a composite object for
each cluster obtained. The assigning process depends upon the clustering, the
obstacles and the available composite objects. Each object is assigned to only one
cluster.

The decided parameter Path Obji represents the path between Obji and the object to
which it was assigned. This parameter may have different algorithms to create a path
between the two objects and an evaluation criterion to select the one with minimum
crossings or minimum number of direction changes. Before finding a path between the
objects, it is necessary to identify the connection points, origin (Position Obji) and
destination (Position Objj), and the obstacles to be avoided. Only after the object's
destination is known is it possible to establish the dependence between them. This
dependence is said to be dynamically ascertained. The obstacles at path decision time
are obtained from the parameter Obstacles Path Obji.
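One possible path generation algorithm is a shortest-path search that avoids obstacles. The sketch below applies Dijkstra's algorithm to a 4-connected grid with unit step cost; the grid discretization, the function name `shortest_path` and the sample obstacle wall are our assumptions, and the criteria of minimum crossings or direction changes are not modeled.

```python
import heapq


def shortest_path(start, goal, obstacles, width, height):
    """Dijkstra on a 4-connected grid; returns the list of cells, or None if blocked."""
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            path = [cell]
            while cell in prev:          # walk back from the goal to the start
                cell = prev[cell]
                path.append(cell)
            return path[::-1]
        if d > dist[cell]:
            continue                     # stale queue entry
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if inside and nxt not in obstacles and d + 1 < dist.get(nxt, float("inf")):
                dist[nxt] = d + 1
                prev[nxt] = cell
                heapq.heappush(heap, (d + 1, nxt))
    return None


# Hypothetical 5x5 world: a wall at x = 2 leaves a single gap at (2, 4).
wall = {(2, y) for y in range(4)}
route = shortest_path((0, 0), (4, 0), wall, 5, 5)
```

The origin and destination play the role of the two connection points, and the obstacle set stands for the content of an obstacle parameter at decision time.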

The derived parameter Destination Obji represents the object to which Obji was
assigned, and the derived parameter Group Obji represents the objects assigned to the
group of Obji, in case Obji is type-composite.

SpADD's Object Oriented Class Model. In addition to the canonical knowledge
network model, the SpADD framework also provides an object-oriented class model
to support the whole SpADD reasoning process, presented in Fig. 3. The main classes
of this model are the primitive, derived and decided parameter classes. When an
object is created, several instances of these classes are created according to the
canonical model of Fig. 2. These instances are related to form the parametric network
of a specific problem.

In Fig. 3, thick lines represent inheritance between classes and thin lines the
possibility of creating instances of the classes. The class Parameter Manager has
methods to create instances of alternative values and of primitive, derived and decided
parameters. The class Evaluator has methods responsible for calculating and evaluating
the parameters' values. The class Knowledge Base is responsible for keeping the
problem's knowledge base. The class Canonical Knowledge Base has the knowledge about
the SpADD problem to be solved, such as the canonical parametric network, and the
methods to calculate each derived and decided parameter's value. This class is
particular to each problem. The class User Interface has the methods for the user
interface, including a method to create an instance of the spatial problem.




Fig. 3. SpADD’s object oriented classes model. 



4.2 SpADD’s Reasoning 

Reasoning in SpADD follows ADD's decision process, except over cyclic dependencies,
i.e., parameters can be evaluated only after their sub-parameters have been
evaluated, in a depth-first search style. Decisions involve generating alternative
values and evaluating them through a rational decision-making process. Alternative
values in the layout design domain are generated, instead of previously listed, using
a variety of generation algorithms, such as the agglomerative nesting procedure [7]
to generate object clusterings.

SpADD uses an iterative resource-bounded procedure to treat the cyclic dependencies
and generate alternative values. Each parameter that has a cyclic dependence
must have an initial hypothetical value. SpADD imposes an ordering on the evaluation
sequence of a cyclic parameter. First, all its acyclic parameters are calculated, as in
ADD reasoning, considering the hypothetical value for the parameter. Afterwards, a
new hypothesis is calculated for the cyclic parameter. If the difference between the
new parameter value and the old one is significant, all cyclic parameters are
re-calculated. The evaluation loop continues until the difference between the new and
old hypotheses is less than an accepted value or until a certain number of cycles has
been run. This behavior guarantees that the search will end.
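The loop can be illustrated with the well and platform cycle described earlier. The sketch is ours, not SpADD's code: objectives are assumed to be circles, the platform hypothesis is a single point, and the names `closest_point_in_circle`, `solve_cycle`, `tol` and `max_cycles` are hypothetical.

```python
import math


def closest_point_in_circle(center, radius, target):
    """Point of the disc (center, radius) that is nearest to target."""
    d = math.dist(center, target)
    if d <= radius:
        return target
    t = radius / d
    return (center[0] + t * (target[0] - center[0]),
            center[1] + t * (target[1] - center[1]))


def solve_cycle(objectives, platform_guess, tol=1e-6, max_cycles=50):
    """Alternate well positioning and platform positioning until the change is small."""
    platform = platform_guess                  # initial hypothetical value
    for cycle in range(1, max_cycles + 1):     # resource bound on the number of cycles
        # Wells are placed as close as possible to the current platform hypothesis.
        wells = [closest_point_in_circle(c, r, platform) for c, r in objectives]
        # New hypothesis: the platform sits at the geometric center of the wells.
        new_platform = (sum(x for x, _ in wells) / len(wells),
                        sum(y for _, y in wells) / len(wells))
        change = math.dist(platform, new_platform)
        platform = new_platform
        if change < tol:                       # difference no longer significant
            break
    return platform, wells, cycle


# Two hypothetical circular objectives of radius 1, and a deliberately poor guess.
objectives = [((0.0, 0.0), 1.0), ((10.0, 0.0), 1.0)]
platform, wells, cycles = solve_cycle(objectives, (5.0, 5.0))
```

Either stopping condition ends the loop: a small enough change between consecutive hypotheses, or exhaustion of the cycle budget.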



5 ADDSUB: A Working Prototype Applied to Assist Design and
Documentation of Oil Pipeline Layout

The oil pipeline layout problem in deep water oil fields has been a good laboratory
to test the feasibility of SpADD. We built a computational model, called ADDSUB, a
working prototype of SpADD applied to assist the design and documentation of oil
pipeline layouts.

As described in the introduction, in oil pipeline layout the designer's tasks
consist of identifying groups of oil targets, positioning platforms to receive and
treat the oil coming from a group, positioning wells in the target area as close as
possible to the platform, and routing the oil pipeline to connect wells to platforms.




Fig. 4. ADDSUB's oil target areas data input screen dump.

ADDSUB offers a design environment where the undersea topography and texture,
as well as the existing objects, are presented in a canvas area, as illustrated in
Fig. 4. New oil target areas can be created and positioned by direct manipulation.
Other input data are entered through wizards and dialog boxes.

Fig. 5 illustrates an oil pipeline layout design. Big circles represent the target oil
areas; the small gray circles, the oil wells; the small rectangles, the platforms; the
slim rectangles, the pipelines; the lines, the topography; and the rest are obstacles
to the design.

Fig. 5. ADDSUB's decisions screen dump.




Fig. 6. ADDSUB’s explanation screen dump. 

ADDSUB operates in five different modes: Data-Entry, Suggest, Verify, Knowledge
Acquisition and Explanation. In the Data-Entry mode, users input the data
configuring the project to be developed. In the Suggest mode, users request
suggestions from the system on the layout decisions. Fig. 5 illustrates a system output
after a user request for the pipeline routing. In the Verify mode, users propose
solutions and the system analyzes them. In this mode, partial solutions are considered,
such as defining a subset of target areas that must be together. In the Knowledge
Acquisition mode, users may include or modify calculation methods, alternative
solutions and design criteria. In the Explanation mode, users can obtain decision
explanations. As shown in Fig. 6, the upper area presents the chronological order of
the decisions, the lower right screen area presents the domain knowledge, the lower
left screen area presents the considered alternative solutions, and the middle area
presents the explanation.

Initial results demonstrated the success of the partnership between designer and
computational agent in finding better solutions. Computational assistance was
especially welcome due to the threshold and horizon effects characteristic of the
domain. Small changes in an object's location turn an almost optimal solution into an
unfeasible one. In addition, given the high cost of platforms, pipelines and wells, a
small improvement in the layout leads to great savings (on the order of thousands to
millions of dollars).



6 Related Work 



Improving design documentation and rationale for spatial layout design problems is a 
recent research topic in engineering, computer science and information systems. 
Many researchers in the area of spatial reasoning have studied each aspect of spatial 
layout design (clustering, assigning, positioning, route-finding and documentation). 

Methods for each one of the spatial layout design decisions have been developed. For
example, over the last three decades, a wealth of algorithms and computer programs
has been developed for cluster analysis [8] [7] [1]. In 1990, Kaufman introduced the
main approaches to clustering and provided guidance on the choice between six methods
[7]. In the path routing decision, we can use Dijkstra's algorithm [4] for the
computation of the shortest path between two given points on a network. In 1995,
Beattie [2] presented her "tesseral" addressing, an alternative method for the
localization of points in space able to represent the domain in a uni-dimensional
sequence. Finally, in 1992, Garcia presented her active design documentation model,
avoiding geometric reasoning [5]. These research efforts have treated specific aspects
of spatial layout design in isolation from each other.

We have studied all these approaches and combined the best of each to propose
an approach to deal with SLDP, the SpADD approach.



7 Conclusions 

In this paper we presented the SpADD framework, an extension of the ADD design
model applied to spatial layout design problems. SpADD addresses cyclic dependencies,
unknown dependencies, independent decision ordering, multiple methods for alternative
design generation, a canonical knowledge network and an object-oriented class model.

An additional contribution of SpADD is its use as a tool for evaluating different
methods of alternative design generation. For example, the clustering problem
receives a great deal of attention from the optimization community ([8], [7], [1]).
SpADD could be used as an evaluation tool contrasting the methods over a set of
metrics.

A prototype system was developed for the domain of oil pipeline layout showing 
the feasibility of the approach. 

We are currently working on clustering methods and on automatic explanation
generation. We are studying the use of an auxiliary qualitative model to interpret
quantitative decisions and generate understandable design decision rationale for users.



Abstract. Currently, there is a considerable body of experience in building
ontologies. Nevertheless, knowledge acquisition using ontologies is
still a research issue. The goal of this paper is to take a further step 
towards a systematic approach for building ontologies. An approach for 
engineering ontologies is presented with a case study. This approach in- 
corporates the best features of the existing methods and proposes other 
features, such as the use of a graphical language for expressing ontolo- 
gies, an axiom classification and some guidelines for ontology capture, 
formalization, evaluation and documentation. An ontology development 
process model is also discussed, showing how to proceed in the develop- 
ment of ontologies. 



1 Introduction 

Traditionally, the development process of Knowledge-Based Systems (KBSs) was
viewed as a process of extracting knowledge from a human expert and transferring
this knowledge into a KBS. This transfer view, however, proved to be
unproductive. Nowadays there is a consensus that KBS development is, indeed,
a modeling activity, and many methods have been proposed in this sense, most of
them emphasizing task modeling, such as [1] and [2].

More recently, domain knowledge modeling has started to receive more attention
and ontologies have played an important role. Although ontologies are being
used more and more, the engineering of ontologies is still a research field. Several
efforts have been made to define a systematic method for building ontologies
and some of them have yielded worthy contributions, such as [3], [4]
and [5].

The goal of this work is to take a further step. Section 2 discusses the use 
of ontologies in knowledge acquisition. Section 3 presents a graphical language 
for expressing ontologies. Section 4 presents a systematic approach for building 
ontologies. In section 5, a case study is presented. Section 6 discusses related 
works. Finally, in section 7, the conclusions of this work are presented. 

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 349-360, 1998. 



2 Ontologies

It is impossible to represent the real world, or even some part of it, in full
detail. To represent some phenomenon or part of the world, which we call a domain,
it is necessary to focus on a limited number of concepts that are sufficient and
relevant to create an abstraction of the phenomenon at hand. Thus, a central
aspect of any modeling activity consists of developing a conceptualization: a set
of informal rules that constrain the structure of a piece of reality, which an agent
uses to isolate and organize relevant objects and relations [6].

An ontology is a specification of a conceptualization [5], that is, a description
of concepts and relations that can exist for an agent or an agent community.
Basically, an ontology consists of concepts and relations, and their definitions,
properties and constraints expressed as axioms. An ontology should not be only
a hierarchy of terms, but a framework that describes the domain.

One of the main benefits of using ontologies in KBS development
is the opportunity to adopt a more productive strategy for Knowledge
Acquisition (KA). In traditional KA, for each new application to be built, a
new conceptualization is developed. This is reflected in how KA is carried out:
for each new KBS, an acquisition phase is accomplished, almost always from
scratch, focusing on all the particularities of the system at hand. This approach,
however, is extremely expensive. Since KA is the activity that requires the
greatest effort in KBS development, it is important to share and reuse the
captured knowledge.




Fig. 1. (a) Traditional approach to KA and (b) Ontology-based approach to KA 



In an ontology-based approach, KA can be accomplished in two stages.
First, the general domain knowledge, relevant to several applications, should be
elicited and specified as ontologies. These, in turn, are used to guide the second
stage of KA, when the particularities of a specific application are considered.
In this way, the same ontology can be used to guide the development of several
applications, diluting the costs of the first stage of KA and allowing knowledge
sharing and reuse. Fig. 1 shows these two approaches.

3 A Graphical Language for Expressing Ontologies 

In KA, the use of a graphical representation is essential in order to facilitate
the communication between knowledge engineers and experts. In ontology building,
such a representation is basically a language representing a meta-ontology. So,
this language must have basic primitives to represent a domain conceptualization
and, in its simplest form, it should have notations to represent only concepts and
relations.

Nevertheless, some types of associations have a strong semantics and, indeed, 
hide a generic ontology. For each one of these types, a specialized notation can 
be proposed. In fact, this is the striking feature of the Graphical Language for 
Expressing Ontologies (GLEO) presented here and what makes it different from 
other graphical representations: any notation, beyond the basic notations for 
concepts and relations, aims to incorporate a theory. In this way, axioms can be 
automatically generated. 

In the current version, there are special notations for whole-part and sub- 
type associations. When we use a whole-part relation, we are incorporating a 
generic ontology of composition to the ontology in development. To represent 
this kind of relation, we used a filled line with a small circle close to the whole. 
When we develop a taxonomy of concepts, we are implicitly committing to a 
subsumption ontology. Since the sub-type association occurs between concepts
and not between instances, we used a dotted line to represent this kind of
relation, with an arrow pointing to the super-type. Fig. 2 summarizes the main
notations of GLEO.

Fig. 2. Main notations of GLEO



The use of special notations to enhance different types of relations can be
a direct way to integrate generic ontologies with domain ontologies. A tool for
ontology development that adopts this philosophy embeds a powerful theory
inclusion mechanism. Each type of relation specifying a generic theory can have
its own notation and, whenever it is used, ontologies would be automatically
integrated. In this sense, GLEO should not be considered a static and finished
representation: when a theory about some kind of association is identified,
a specific notation can be included.
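A theory inclusion mechanism of this kind could be sketched as a registry that maps each special notation to the axioms of its generic theory. The code below is only an illustration: the axioms listed for the whole-part theory (irreflexivity, asymmetry, transitivity) are common choices we assume, not necessarily those adopted in GLEO, and the concept names in the diagram are hypothetical.

```python
# Hypothetical axiom sets attached to special notations (our assumption).
THEORIES = {
    "whole-part": [
        "forall x: not partOf(x, x)",
        "forall x, y: partOf(x, y) -> not partOf(y, x)",
        "forall x, y, z: partOf(x, y) and partOf(y, z) -> partOf(x, z)",
    ],
}


def included_axioms(edges, theories=THEORIES):
    """Collect, once each, the axioms of every theory whose notation is used."""
    axioms = []
    for kind, _source, _target in edges:
        for axiom in theories.get(kind, []):
            if axiom not in axioms:
                axioms.append(axiom)
    return axioms


diagram = [
    ("whole-part", "Wheel", "Car"),
    ("relation", "Person", "Car"),       # plain relation: no theory attached
    ("whole-part", "Engine", "Car"),
]
axioms = included_axioms(diagram)

# Registering a notation later extends the language without changing the function.
THEORIES["sub-type"] = [
    "forall a, b, c: subTypeOf(a, b) and subTypeOf(b, c) -> subTypeOf(a, c)",
]
```

Drawing the same notation twice includes its theory only once, and a notation added to the registry later is picked up by any diagram that uses it.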

4 A Systematic Approach for Engineering Ontologies 

Although there are some proposals in the literature for building ontologies, such
as [3] and [4], this field is still the object of several efforts. The goal of this
section is to go one step further in transforming ontology development from art to
engineering. The activities of the ontology development process are discussed
and some guidelines about how to perform them are presented. In addition, a life
cycle is proposed, showing the interactions between the activities.

Basically, the proposed method comprises the following activities: purpose
identification and requirement specification, ontology capture and formalization,
integration of existing ontologies, and ontology evaluation and documentation.



4.1 Purpose Identification and Requirement Specification 

The first activity to be performed in the ontology development process is to 
clearly identify its purpose and its intended uses, that is, the competence of 
the ontology. The competence of a representation is concerned with the span 
of questions it can answer [4] or the tasks it can support. By establishing the 
competence, we reach an effective way to determine what is relevant to the 
ontology and what is not. We should also identify the scenarios that motivated 
the development of the ontology in hand. 

Given the ontology purpose, we should specify its requirements. These should 
take into account the intended uses and can be stated as competency questions: 
the questions that the ontology should be able to answer [4]. By specifying a 
relationship between the competency questions and the motivating scenarios, 
we give an informal justification for the ontology and, what is better, we provide 
a way for its evaluation. 

If the domain of interest is too complex, we should use some decomposition
mechanism to dilute this complexity. An interesting approach is to consider
leveled ontologies: we start by building core ontologies, basic for the domain under
study; from these ontologies, higher-level ones are built, adding new elements
and, thus, extending the ontologies of the lower level.



4.2 Ontology Capture 

Doubtlessly, this is the most important step in the ontology development. The 
goal is to capture the domain conceptualization based on the ontology compe- 
tence. The relevant concepts and relations should be identified and organized [3]. 
A model using a graphical language, with a dictionary of terms, should be used 
to facilitate the communication with the domain experts. 

Primitive concepts - those that cannot be defined in terms of other concepts
in the ontology - should be defined using natural language and examples. The
terms used to refer to the knowledge categories should be chosen carefully,
avoiding terms with vague interpretation. Concepts that can be described using
other concepts should clearly refer to them. We should also develop taxonomies,
organizing domain knowledge classes and sub-classes.

Concepts and relations are the basis of an ontology, but an essential feature 
is the definition of axioms. Simply proposing a taxonomy or a set of basic terms 
does not constitute an ontology. Axioms should be provided in order to fix the 
semantics of the terms. Axioms specify concept definitions and constraints over 
their interpretation. In this step, it is not necessary to write down formal axioms, 
rather these axioms should be written in natural language, considering only the 
domain constraints. 

The axioms in an ontology can present two different forms and purposes:
derivation axioms and consolidation axioms. Derivation axioms are those that
allow new information to be derived from the previously existing knowledge. So,
they are a means of deduction and represent logical consequences. Consolidation
axioms, on the other hand, typically define constraints for establishing a relation
or for defining an object as an instance of a concept.

The derivation axioms can be rooted in the meaning of the concepts and
relations of the ontology or in the way these concepts and relations are structured.
When axioms are defined to show constraints imposed by the way concepts are
structured, we call them epistemological axioms. When they describe constraints
on domain meaning, we call them ontological axioms.

This classification based on the nature of the axioms is a good guideline to
drive axiom definition. We should take care to capture axioms that consider
the structuring of the concepts and relations (the epistemological axioms),
their meanings and constraints (the ontological axioms) and the integrity laws
that govern them (the consolidation axioms).

The process of defining axioms should be guided by the competency ques- 
tions. The axioms in the ontology must be necessary and sufficient to express 
the competency questions and to characterize their solutions. Any solution to 
a competency question must be entailed by or consistent with the axioms in 
the ontology. If the proposed axioms are not enough for this purpose, then ad- 
ditional concepts, relations or axioms must be added to the ontology. In this 
sense, the ontology capture is an iterative process, strongly linked with the eval- 
uation. There may be many different ways to axiomatize an ontology and the 
competency questions should be used to evaluate the completeness of the sets of 
axioms in a particular axiomatization [4]. 

Quality criteria for ontology design [5] should be observed, chiefly: clarity, 
concerning the meaning of the defined terms; coherence, mainly between the 
textual definitions and the examples; and minimal ontological commitment, to 
allow the parties committed to the ontology to be free to specialize and
instantiate the ontology as needed. The last criterion is directly related to the axiom
definition.



4.3 Ontology Formalization 

The logical analysis of a universe of discourse is better performed when it is
described using a formal language. In this language, in contrast to natural
language, we have signs that are unambiguous and formulations that are exact
and, therefore, the clarity and correctness of a deduction can be tested with
greater ease and accuracy. A deduction in natural language often involves
presuppositions which were not made explicit, but which were taken for granted
in the deduction process.

In the formalization, we should establish a formalism to represent the
ontology knowledge categories. Once the representational formalism to be
used has been defined, it is possible to fix the ontology terminology and, mainly,
the semantics of its interpretation. It is important to stress that a formal ontology
is not able to replace the description of a conceptualization in natural language;
rather it is to be used to support it or to be added to it, working as a device where
some ideas are checked in relation to completeness and, perhaps, coherence.

In short, the aim is to explicitly represent the conceptualization captured in
the previous step in a formal language, which involves the commitment to some
meta-ontology, the choice of a representation language and the development of
the formal ontology. When no special commitment to a specific meta-ontology
that proves adequate to the ontology under development is necessary,
first-order logic tends to be the most adequate formalism,
since it is the formalism that embeds the fewest ontological commitments.

To describe a first-order formal language, it is necessary to specify the
non-logical symbols of its alphabet, that is, the constants, denoting specific
individuals of the universe of discourse, the function symbols, denoting functions,
and the predicates, denoting properties of and relations between the individuals.
In fact, when first-order logic is the formalism adopted, this must be the
first step in the formalization phase: to map the ontology elements to constants,
functions and predicates. After that, it is possible to create statements about
the individuals in the domain, the formal axioms.
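
As a minimal sketch of this first formalization step, the non-logical symbols can be recorded in a small signature before any axiom is written. The Python code is only an illustration: the class `Signature`, its fields, the method `check_atom` and the sample symbols are assumptions made for this example and are not part of the proposed method.

```python
from dataclasses import dataclass, field


@dataclass
class Signature:
    """Non-logical symbols of a first-order language (illustrative only)."""
    constants: set = field(default_factory=set)
    functions: dict = field(default_factory=dict)   # function name -> arity
    predicates: dict = field(default_factory=dict)  # predicate name -> arity

    def check_atom(self, predicate, *args):
        """Return True if the predicate is declared and used with its arity."""
        return self.predicates.get(predicate) == len(args)


# Hypothetical symbols, chosen only to make the example concrete.
sig = Signature(
    constants={"QualityAct", "TestAct"},
    functions={},
    predicates={"activity": 2, "input": 2, "output": 2, "preactivity": 2},
)

ok = sig.check_atom("activity", "a", "QualityAct")  # declared, arity 2
bad = sig.check_atom("input", "s")                  # wrong number of arguments
```

Declaring the alphabet first makes it possible to reject ill-formed atoms before any axiom is stated.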



4.4 Integrating Existing Ontologies 

During the capture and/or formalization processes, it may be necessary to
integrate the current ontology with existing ones [3], in order to take advantage
of previously established conceptualizations. Indeed, it is good practice to develop
general modular ontologies, more widely reusable, and to integrate them, when necessary,
to obtain the desired result. If many details are necessary, the ontology must
incorporate only the essential ontological commitments and the others should
be described in a micro-theory [7] or in a higher-level ontology. Thus, we preserve
the quality criterion of minimal ontological commitment.

4.5 Ontology Evaluation

Finally, the ontology must be evaluated to check whether it satisfies the specifi- 
cation requirements. Further, it should be evaluated in relation to some design 
quality criteria. The set of criteria proposed by Gruber [5] should be adopted 
both to guide the development and to evaluate the quality of the developed 
ontologies. 

We should notice that this step can, and in fact should, be performed jointly
with the capture and formalization steps, in an iterative process. The use of a
graphical model is very important when evaluating an ontology with domain
experts. Furthermore, the competency questions play an essential role in the
evaluation of the completeness of the ontology, especially when considering its
axioms. In a particular domain it is possible to write down a great number
of axioms and, therefore, we have to take care not to write down more
axioms than necessary. In this context we ought to keep in mind the principle
of minimal ontological commitment and keep the competency questions at hand.
The set of axioms must be necessary and sufficient to express the competency
questions and to characterize their solutions, and nothing else [4]. Redundant
axioms or those that do not contribute to the competence of the ontology must
be excluded.

4.6 Ontology Documentation 

The whole ontology development must be documented, including purposes,
requirements and motivating scenarios, textual descriptions of the conceptualization,
the formal ontology and the adopted design criteria. Like evaluation,
documentation is a step that has to occur in parallel with the others.

The terms captured in the domain conceptualization must be described in 
a Dictionary, considering two important principles: the auto-reference principle 
and the minimal vocabulary principle. The minimal vocabulary principle con- 
cerns the vocabulary used in the definition of the ontology terms. This vocabu- 
lary must be as small as possible and should not present any ambiguities. Any 
term without clear and unambiguous meaning must be defined as an entry in the 
Dictionary. The auto-reference principle says that the definition of terms in the 
Dictionary, when possible, should be done using terms that have already been 
described in it. Based on this principle, a potential approach to documenting an
ontology is to use hypertext, allowing browsing across term definitions, examples
and their formalization, including the axioms.

4.7 The Ontology Development Process 

The ontology development process should be viewed as a strongly iterative
process rather than as a sequence of steps. The capture step can reveal new
requirements. During the evaluation, we may notice that the identified terms are
not sufficient for the intended purpose of the ontology, forcing a return to the
capture step. Similar situations can occur in the formalization step: inconsistencies
can be detected, causing a review of the specification and of the terms defined in the
ontology. Finally, if integration of existing ontologies is necessary, it can have
substantial impact on the definition and formalization of the terms. The steps of
the ontology development process and their interdependencies are illustrated in
Fig. 3.




[Figure: diagram not reproduced; the only recoverable label is "Formal Ontology".]



Fig. 3. Steps in the ontology development and their interdependencies. 



The broken lines indicate that there is a constant interaction, albeit weaker,
between the associated steps. The solid lines show the main workflow in the
ontology building process. The box enclosing the capture and formalization steps
emphasizes the strong interaction, and consequently iteration, between these steps.

Given the formal ontology, it is often desirable to make it operational. To
do so, two other activities must be performed: design and coding. In the design
step, concepts, relations and axioms of the formal ontology must be mapped to
a format compatible with the chosen implementation language. In the coding
step, the ontology is coded in the chosen language.

5 A Case Study: A Software Process Ontology

To illustrate the application of the proposed method, in this section we present
part of the software process ontology developed to promote knowledge integration
in the TABA Workstation [8], a software development meta-environment.

The TABA Project aims at the construction of a configurable workstation for
software development. Since the meta-environment and its instantiated
environments need to handle knowledge about software development processes, this
knowledge has to be shared across the Workstation as a whole. Thus, we used an
ontology-based Knowledge Engineering approach to develop a modular
knowledge base of software processes for this environment.

The software process ontology aims to support the acquisition, organization,
reuse and sharing of software process knowledge in the TABA Workstation, as
shown in Fig. 4. Every knowledge-based tool committed to this ontology will
share a common vocabulary, facilitating the communication between developers
and, most importantly, allowing the sharing and reuse of knowledge bases in
the meta-environment as much as in the instantiated environments.



[Figure: diagram not reproduced; recoverable label: "Ontology of Software Process".]




Fig. 4. Uses of the ontology in the TABA Workstation. 



Given the complexity of this domain, we adopted a leveled approach for
developing ontologies. The software process ontology was developed on top of
central domain ontologies, namely, ontologies of activity, procedure and resource.
For each central ontology, the proposed method was recursively applied. In the
formalization step, we used first-order logic and established a formal
first-order language about software processes.

Due to space limitations, it is not possible to present the entire ontology. Since
the goal of this section is to show how to apply the proposed method, we decided
to present only some aspects judged capable of illustrating its application. We
present parts of the activity ontology and of the software process ontology.
The procedure and the resource ontologies are not presented. A more complete
view of this ontology can be found in [9].

5.1 Activity Ontology 

The concept of activity is at the core of any software process model. Activities
can occur at several levels, from an elementary task to a development process
phase. An activity is the basic transformational action primitive that uses
input artifacts to produce output artifacts, supported by resources. Basically, an
activity ontology must be able to answer the following competency questions:

1. What is the nature of an activity?

2. In which sub-activities is an activity decomposed? 

3. Which activities must antecede a given activity? 

4. Which artifacts are input to, or produced by, a given activity?

5. Which resources are required by an activity to be performed?

6. Which procedures can be adopted to perform an activity? 

Fig. 5 shows a partial model of the activity ontology, which does not cover all its
terms. In fact, there are several other concepts in this ontology. For instance, to
capture the dependence between activities, we defined the concepts of pre-activity
and post-activity. The terms used were defined in a Dictionary.



[Figure: diagram not reproduced; recoverable labels: output, Artifact, Management Activity, Quality Assurance Activity.]



Fig. 5. Part of the activity ontology. 



The axioms of the ontology were developed to provide a basic interpretation
of the concepts in the ontology and to capture and write down the constraints
imposed in the domain. The ontological axiom below, for instance, defines the
concept of pre-activity from the input and output relations: an activity $a_1$ is a
pre-activity of an activity $a_2$ if and only if there exists at least one artifact $s$
that is an output of $a_1$ and an input to $a_2$.

$$(\forall a_1, a_2)\,\bigl(\mathit{preactivity}(a_1, a_2) \leftrightarrow (\exists s)(\mathit{input}(s, a_2) \land \mathit{output}(s, a_1))\bigr) \qquad (1)$$

The predicate activity(a, t), denoting that the activity a is of the type t, was
defined to formalize the existence of different types of activities. The parameter
t can assume one of the following values: {ConstructionAct, ManagementAct,
QualityAct, CertificationAct, TestAct}, representing each of the types identified
in the taxonomy. Further, the following epistemological axioms hold:

$$(\forall a)\,\bigl(\mathit{activity}(a, \mathit{CertificationAct}) \lor \mathit{activity}(a, \mathit{TestAct}) \rightarrow \mathit{activity}(a, \mathit{QualityAct})\bigr) \qquad (2)$$

Since first-order logic is not a typed logic, it is necessary to define
consolidation axioms establishing which types of objects can be used as an argument
in a predicate. Thus, the following consolidation axiom, for instance, should be
observed, where the asterisk (*) indicates that the value of this argument does
not matter:

$$(\forall a_1, a_2)\,\bigl(\mathit{preactivity}(a_1, a_2) \rightarrow \mathit{activity}(a_1, *) \land \mathit{activity}(a_2, *)\bigr) \qquad (3)$$
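
To make the three kinds of axioms concrete, the following sketch evaluates axioms (1) to (3) over a tiny fact base. It is an illustration only, written in Python rather than in the first-order language used for the ontology; the artifact and activity names, as well as the function names, are hypothetical.

```python
# Toy fact base; every name below is hypothetical.
inputs = {("spec", "design"), ("design_doc", "coding")}     # input(s, a)
outputs = {("spec", "analysis"), ("design_doc", "design")}  # output(s, a)
activity = {
    ("analysis", "ConstructionAct"),
    ("design", "ConstructionAct"),
    ("coding", "ConstructionAct"),
    ("unit_test", "TestAct"),
}


def derive_preactivity(inputs, outputs):
    """Axiom (1): a1 is a pre-activity of a2 iff some artifact s is an
    output of a1 and an input to a2."""
    return {(a1, a2)
            for (s_out, a1) in outputs
            for (s_in, a2) in inputs
            if s_out == s_in}


def close_quality(activity):
    """Axiom (2): certification and test activities are quality activities."""
    derived = {(a, "QualityAct") for (a, t) in activity
               if t in ("CertificationAct", "TestAct")}
    return activity | derived


def violates_axiom_3(preactivity, activity):
    """Axiom (3): both arguments of preactivity must be activities of some type.
    Returns the pairs that break this constraint."""
    known = {a for (a, _type) in activity}
    return {(a1, a2) for (a1, a2) in preactivity
            if a1 not in known or a2 not in known}


preactivity = derive_preactivity(inputs, outputs)
activity = close_quality(activity)
violations = violates_axiom_3(preactivity, activity)
```

The first two functions derive new facts (derivation axioms), whereas the third only checks a constraint (consolidation axiom), which mirrors the distinction made in Section 4.2.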



5.2 Software Process Ontology 

Given the basic ontologies, the software process ontology was built from them. 
Fig. 6 shows part of the final model of this ontology. 




Fig. 6. Part of the software development process ontology. 



Many other axioms were proposed. In the procedure ontology, for instance,
the following consolidation axiom was defined in order to capture that the
relation "adoption" is constrained by the relation "possible adoption":

$$(\forall p, a)\,\bigl(\mathit{adoption}(p, a) \rightarrow \mathit{possibleadoption}(p, a)\bigr) \qquad (4)$$



6 Related Work

Uschold and King [3] proposed what they called “a skeletal methodology for 
building ontologies” , defining a small number of stages that they believed would 
be required for any future comprehensive methodology. In this sense, the method
proposed here follows some of their guidelines and extends them towards a more
systematic approach for building ontologies.

In the TOVE (TOronto Virtual Enterprise) Project, Gruninger and Fox [4]
proposed a method for building ontologies that presents some features that are
very specific to its context, enterprise modeling. In fact, it can be considered
an applied approach and not a general one. Nevertheless, many guidelines
suggested by this method are interesting, such as the use of competency questions
to guide the development, and were incorporated in the proposal presented here.

The leveled approach employed in this work can be viewed as an extension of
the use of layered ontologies in CommonKADS [1]. In CommonKADS, ontologies
are built in three layers, where each layer is formulated in terms of a lower-level,
more widely applicable ontology. In the approach advocated here, there is neither a
pre-defined number of levels nor rigid features for each level. The lowest level, as
in CommonKADS, corresponds to the meta-ontology, but above it any number
of ontology levels can be used. If generic ontologies are used, they must be placed
in the lower levels. Core ontologies, basic for a wide domain, should be placed in
the intermediate levels. The application ontologies must appear in the higher
levels.

7 Conclusion 

In order to develop KBSs with the desired quality and productivity,
knowledge acquisition must be oriented towards reuse. In this context, ontologies play
an important role. In this paper we presented a method for building ontologies
and parts of the software process ontology developed using the proposed
method.

Abstract. This article proposes a method of automatic diagnosis of the
level of knowledge reached by the user. The technique used
for that consists in comparing the choices made by the user with those
made by an expert system under the same conditions. This comparison
will enable us to know by whom - the system or the user - the most
efficient choices have been made. The data thus gathered are of the
utmost importance for both explanatory and tutoring systems. Through
the GeneCom and Bateleur systems, a method aimed at guessing and
evaluating the user's intentions will be presented. This method will
enable the system to define the level of the user and as a consequence to
know what has to be explained to him. It will also enable the system to
find out its own shortcomings and to know what it has to learn.



1 Introduction 

This article proposes a method to evaluate the level of the user through the running of
two rule-based systems: Bateleur and GeneCom. These two systems have been
implemented and tested successfully at LAFORIA (University of PARIS 6).
Bateleur is able to simulate the decisions made by a user. Bateleur has been applied to
the game of tarot. GeneCom is a system which generates comments. It is able to
elaborate explanations of the choices made by the user. GeneCom uses Bateleur as a
database of the domain of application. In order to give explanations, GeneCom needs to
analyse the decisions of the user and it also needs to guess if the user is an expert or a
beginner.

The problem of the representation of the user's knowledge is essential to produce a
good diagnosis (and, as a consequence, to produce good explanations [4] or
comments). That is the reason why a good model of the user is essential for
explanatory systems as well as for tutoring systems [5]. Many researchers have taken
an interest in this issue of user modelling. Robert Kass [8] has listed three
types of modelling patterns concerning the knowledge of the learner in comparison
with the knowledge of the system:

Helder Coelho (Ed.): IBERAMIA'98, LNAI 1484, pp. 361-372, 1998.

1. The cover model proposed by Carr and Goldstein [3] is the simplest of the
three models. It considers that the system has all the knowledge concerning the
domain and that the learner has only a part of this knowledge. It has also been used by
Clancey [6] in GUIDON and NEOMYCIN.



[Figure: regions labelled "Knowledge of the domain and of the system" and "Knowledge of the user".]



Fig. 1. Cover model. 

2. The differential model proposed in the West system [2] suggests that the user has a 
part of the knowledge of the system and that the system has only a part of the know- 
ledge of the domain. 



[Figure: regions labelled "Knowledge of the domain", "Knowledge of the system" and "Knowledge of the user".]



Fig. 2. Differential model. 

3. The third model is called disrupted. It is based on the cover model, but "mal-rules"
are added to the knowledge of the learner to model the mistakes of the user. This
technique is used in systems such as Debuggy [1] and Proust [7].



[Figure: regions labelled "Knowledge of the domain", "Knowledge of the system", "Knowledge of the user" and "Mal-Rules".]



Fig. 3. Disrupted model. 



The complementary model (see Fig. 4), proposed in our system, is a generalization of
the cover model and of the differential model. Unlike previous models, we do not
take for granted the superiority of the system over the user. We consider that the system
may have to be run by various types of users, beginners as well as experts. The
knowledge of the user may be more or less extensive than the knowledge of the system.
Both the system and the user may also have their specific field of knowledge. In this
case, they both have, besides their common knowledge, some part of knowledge that
the other one does not have.
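
One way to picture the difference between these models is to treat each body of knowledge as a set of items. The sketch that follows is only an illustration under that assumption; the functions `classify` and `split` and the rule identifiers are hypothetical, and the disrupted model (mal-rules) is deliberately left out.

```python
def classify(domain, system, user):
    """Name the most specific model that a knowledge configuration matches."""
    assert system <= domain and user <= domain
    if system == domain and user <= system:
        return "cover"          # system knows the whole domain, user a part of it
    if user <= system:
        return "differential"   # user knows a part of what the system knows
    return "complementary"      # the user knows things the system does not


def split(system, user):
    """Complementary model: common knowledge and the two specific parts."""
    return {
        "common": system & user,
        "system_only": system - user,
        "user_only": user - system,
    }


# Hypothetical knowledge items.
domain = {"r1", "r2", "r3", "r4", "r5"}
system = {"r1", "r2", "r3"}
user = {"r2", "r3", "r4"}

model = classify(domain, system, user)
parts = split(system, user)
```

In this reading, `system_only` is what the system could explain to the user, and `user_only` is what the system could learn from the user.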



[Figure: regions labelled "Knowledge of the domain", "Knowledge of the system", "Common knowledge" and "Knowledge of the user".]





Fig. 4. Complementary model. 

This type of modelling is interesting since it makes possible an adaptation of the
behaviour of the system according to the level of the user:

• When the system detects shortcomings in the knowledge of the user (see Fig. 5), it
can generate explanations focused on the detected problem, in order to communicate
its knowledge to the user.

• When the system realizes that the user has a better knowledge of a particular
problem (see Fig. 6), it can start a mechanism of self-tutoring in order to improve its
own knowledge.



[Figure: regions labelled "Knowledge of the domain", "Knowledge of the system" and "Knowledge of the user".]



Fig. 5. Case of a beginner. 



[Figure: regions labelled "Knowledge of the domain", "Knowledge of the user" and "Knowledge of the system".]



Fig. 6. Case of an expert-user. 



The notion of "viewpoint" developed by Etienne Wenger [10] is also included in the
functioning of GeneCom. Indeed, the system builds a model of the knowledge of the
user, taking as a basis its own knowledge of the domain, which is stored in Bateleur.
Consequently, the behaviour of the user is analysed according to the point of view of
GeneCom.



First, this article will present (in section 2) the three levels of decision-making used by
Bateleur (the system which simulates the decisions of a user in the game of tarot).
GeneCom takes for granted that these levels model the process of decision-making of
the user. Then, section 3 will present the general functioning of Bateleur. The
knowledge of Bateleur is used by GeneCom in order to define the level of the user.
Section 4 will explain how GeneCom succeeds in guessing the intentions of the user
by exploiting the information about a real game. Then, section 5 will present the
mechanism used to evaluate the intentions of the user. Section 6 will propose a
method aimed at making a diagnosis of the level of knowledge reached by the user.

2 The Different Levels of Decision-Making

We will limit ourselves to a reasoning process made of three levels of
decision-making. Bateleur uses these three levels and GeneCom takes for granted that the user
makes his choices through those same levels. The three levels of decision-making are
the following ones:

1. The choice of a game plan (a long-term view).

2. The choice of one of the goals of this plan (a middle-term view).

3. The choice of an action which aims at reaching the previously chosen goal (a
short-term view).

The notion of strategy enables us to pass from the level of the plan to that of the
choice of the goal, while the notion of tactics expresses the transition from the level
of the choice of the goal to that of the choice of the action.



2.1 Strategy: A Link between Plans and Goals 

Strategy enables us to choose the goals to be reached according to the plan that has
been set. For instance, each time that Bateleur [NIGRO 93] plays a card, it must
choose among the goals of its game plan the one which is best suited to the
situation.

In Fig. 7, the system has a plan made up of four goals. The strategy has concluded
that goal 3 is to be reached rather than the three other goals of this plan. GeneCom
considers that the list of goals is ordered by importance. The choice of the goal
consists in discarding the goals which cannot be reached and selecting the remaining
goal with the highest level of importance.




Fig. 7. Transition from the plan to the goal. 



2.2 Tactics: A Link between Goals and Actions 

Tactics consists in finding one action to be carried out in order to reach the previously
selected goal. If several actions are found, GeneCom picks one at random.
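
The two transitions can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names, the goal labels and the table `actions_for` are hypothetical, and the actual systems decide reachability and candidate actions with rules rather than with lookups.

```python
import random


def choose_goal(plan, reachable):
    """Strategy: drop unreachable goals, keep the most important one.
    `plan` is a list of goals ordered from most to least important."""
    candidates = [goal for goal in plan if goal in reachable]
    return candidates[0] if candidates else None


def choose_action(goal, actions_for, rng=random):
    """Tactics: pick one action serving the goal, at random if there are several."""
    options = actions_for.get(goal, [])
    return rng.choice(options) if options else None


# Hypothetical plan of four goals, of which only the last two can be reached.
plan = ["goal 1", "goal 2", "goal 3", "goal 4"]
reachable = {"goal 3", "goal 4"}
actions_for = {"goal 3": ["King of Clubs", "Queen of Clubs"]}

goal = choose_goal(plan, reachable)
action = choose_action(goal, actions_for, rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes the random pick reproducible, which is convenient when a trace has to be replayed.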




Fig. 8. Transition from the goal to the action. 

3 The Functioning of Bateleur

With more than 400 rules for the game of tarot, Bateleur can be considered a tarot
player of average level. This system is based on first-order rules and is able to
simulate the behaviour of one or several tarot players. The simulation of a player's
reasoning goes through the three levels of decision-making shown in Fig. 9.




Fig. 9. Bateleur's different levels of decision-making.



In the example given in Fig. 9, the plan selected by Bateleur is made up of four goals.
For each trick, Bateleur chooses the goal which is best suited to the situation (for
instance "clearing the long suit"). Then the system selects the action which is best
suited to the chosen goal (for instance the "King of Clubs"). This structure presents a
progressive refining in the decision of the choice of the action. Its advantage is that it
gives us a trace of execution at various levels of abstraction, thus making it easier to
generate explanations and to adapt the tutoring process (see section 6).

Note that Bateleur can change its plan during the session, but for ease of understanding
we will suppose in this article that only one plan is considered.



4 The GeneCom System Guesses the User’s Intentions 



4.1 Search of the Possible Goals 

This section will present the first stage performed by GeneCom. Considering the actions
chosen by the user, GeneCom tries to guess the goals that the user wanted to reach.
The system uses its meta-knowledge [9] to search in Bateleur's rule database for the
different goals which may have triggered the rules that lead to the choice of the
actions played by the user.

For instance, in Fig. 10, starting from the "Four of Spades" (fourth trick of the player),
GeneCom searches among the goals used by Bateleur for those triggering the rules that
conclude on the choice of this "Four of Spades": the player may have wanted to play his
losing cards; he may have wanted to bluff a long suit or he may have wanted to play
his singletons.
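
This backward search from a played card to the goals that could explain it can be sketched as a reverse lookup. The Python code is an illustration only: the rule base is reduced to hypothetical (goal, card) pairs, whereas the rules of Bateleur also carry conditions on the state of the game.

```python
# Hypothetical, heavily simplified rule base:
# each pair reads "under this goal, this card may be played".
rules = [
    ("Playing the losing cards", "4 of Spades"),
    ("Bluffing a long suit", "4 of Spades"),
    ("Playing the singletons", "4 of Spades"),
    ("Clearing the long suit", "2 of Clubs"),
]


def possible_goals(card, rules):
    """Return every goal whose rules conclude on the given card."""
    return {goal for (goal, concluded_card) in rules if concluded_card == card}


goals_for_trick = possible_goals("4 of Spades", rules)
```

The result is a set of candidate goals for one trick; repeating the lookup for every card played by the user gives one such set per trick.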







[Figure: diagram not reproduced; recoverable labels: cards "2 of Clubs", "5 of Trump", "4 of Spades", "21 of Trump"; goals "Clearing the long suit", "Playing the long suit to make the opponent trump", "Playing the losing cards", "Playing the singletons", "Bluffing a long suit".]



Fig. 10. Establishing the goals according to the cards. 



4.2 Search of the User’s Plan of Game 

This section shows how GeneCom searches for the user's plan by taking as a basis the
selected goals. This stage consists in finding one of the Bateleur plans which would
explain the user's line of action. The system manages to do that by eliminating all the
plans which do not contain at least one of the candidate goals for each action.




Fig. 11. Transition from the goals to the plans of the user. 

In Fig. 11, plans 1 and 2 have to be cancelled since they do not explain trick 4 (4 of 
Spades) and trick 1 (10 of Clubs). GeneCom assesses that plan 3 of Bateleur is the one 
corresponding to the user's strategy. 
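
The elimination step can be sketched as a filter over the known plans. This is a minimal illustration, assuming that each plan is reduced to a set of goals and that the candidate goals of each trick have already been found; the plan contents and goal labels are hypothetical.

```python
def explaining_plans(plans, goals_per_trick):
    """Keep the plans that contain at least one candidate goal for every trick.
    `plans` maps a plan name to its set of goals; `goals_per_trick` is a list
    of candidate-goal sets, one per trick played by the user."""
    return [name for name, plan_goals in plans.items()
            if all(plan_goals & candidates for candidates in goals_per_trick)]


# Hypothetical plans and candidate goals.
plans = {
    "plan 1": {"A", "B"},
    "plan 2": {"B", "C"},
    "plan 3": {"A", "C", "D"},
}
goals_per_trick = [{"A"}, {"B", "D"}, {"C", "D"}]

remaining = explaining_plans(plans, goals_per_trick)
```

Here "plan 1" fails on the third trick and "plan 2" on the first one, so only "plan 3" explains the whole line of action.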



5 Evaluating the User’s Intentions

This section will describe how GeneCom decides who - between the player and the
Bateleur system - has made the best choice, whether at the level of the plan, of the
choice of the goal or of the choice of an action. In order to decide between the two
possibilities, GeneCom uses Bateleur's database to simulate the consequences of both
choices. In the case of tarot, it will simulate the actions of the four players until the
end of the game.

Then GeneCom compares the assessments of both variants, which corresponds, in the
case of tarot, to the difference in the number of points won by the two variants.

In the example given in Fig. 12, the system is interested in the thirteenth trick of a
game. Bateleur would have played the card Cb whereas the player put the card Cj. By
simulating both possible ends of game, GeneCom finds out that the assessment made
from the card chosen by Bateleur is better than the assessment made from the player’s
choice.



[Figure: two simulated variants of the end of the game (tricks 13 to 18; players South, East, North, West). Variant based on the player's card (Cj): 38 points won. Variant based on Bateleur's card (Cb): 45 points won.]

Fig. 12. The two variants based on two cards.



Actually, three cases can be observed:

1. Bateleur's proposal is the best one: the comment states that the user’s choice is not 
interesting and that it would have been better to do as Bateleur did. 

2. The user’s proposal is the best one: the comment states that the user’s choice is good 
and that it would have been wrong to think of doing like Bateleur. In that case, the 
self-tutoring programme enables GeneCom to learn the user’s knowledge. 

3. Both proposals are equal: the comment states that the user’s choice is good but that 
it would have been possible to do like Bateleur. The self-tutoring programme can 
also be started. 
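
The three cases can be summarized by a small comparison function. The sketch is only illustrative: it assumes that the two assessments (points won in each simulated variant) have already been computed, and the function name, verdict labels and comment strings are hypothetical.

```python
def compare_choices(bateleur_points, user_points):
    """Return (verdict, comment, start_self_tutoring) for one decision."""
    if bateleur_points > user_points:
        return ("bateleur",
                "It would have been better to do as Bateleur did.",
                False)
    if user_points > bateleur_points:
        return ("user",
                "The user's choice is good; Bateleur's choice was weaker.",
                True)
    return ("equal",
            "The user's choice is good; Bateleur's choice was also possible.",
            True)


# Values taken from the example of Fig. 12: 45 points for Bateleur's card,
# 38 points for the player's card.
verdict, comment, tutoring = compare_choices(bateleur_points=45, user_points=38)
```

In the equal case the flag is set to True only because self-tutoring can be started there; whether it actually is started is left open.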

6 A Method of Diagnosis Applied to the Game of Tarot

This section will present a method (applied to the game of Tarot) enabling us to set
the level of knowledge of a user against the level of Bateleur for the choice of a plan
and the management of this plan. The diagnosis at the level of the choice of a goal or
an action follows the same principle.

In order to do that, it is useful to include the assessment of the real game
in the analysis, instead of comparing only the assessments of the two simulated variants
(games played with the plan chosen by the player and with the plan chosen by
Bateleur).

Thus, by comparing the assessments of the two variants and of the real game, the
system can estimate the level of the player set against the level of Bateleur on four
main fields of knowledge:

• Knowledge concerning the choice of a plan (addressed in this section)

• Knowledge concerning the management of a plan (addressed in this section)

• Knowledge concerning the choice of a particular goal for a given plan

• Knowledge concerning the choice of a card in order to reach a given goal.

In a first stage, the player plays a game against Bateleur. This game is called the "real
game". In a second stage, GeneCom analyses the trace of this "real game" and
establishes:

• The plan chosen by the player at the beginning of the game

• The goals chosen by the player throughout the game.

Taking as a basis the plan chosen by the player, GeneCom asks Bateleur to simulate
the game without the player. The differences in results will be used to establish who -
between the player and Bateleur - has managed this plan better. Then GeneCom asks
Bateleur which plan it would have chosen with the same deal of cards. If the plans
are different, GeneCom runs a simulation with the plan chosen by Bateleur (Bateleur
plays against itself).

The three games thus obtained lead to three assessments, which can be different in 
terms of points. These assessments are called: 

• Real-Assessment for the "real game".

• Player-Assessment for the game simulated and based on the player's plan.

• Bateleur-Assessment for the game simulated and based on Bateleur's plan.

By comparing the assessments of the two simulated variants and of the real game, the 
system can diagnose the gaps in the levels reached by the player and Bateleur on the 
various fields of knowledge used in decision-making during the game (in our case, the
choice and the management of the plan). The following diagram shows the three
possible developments of the game.




Fig. 13. Three possible developments of the game. 



When the player's plan is different from the plan advised by Bateleur, GeneCom
calculates the two variants. The first one is based on the player's plan, the second one
on Bateleur's plan. The assessments of these three ends of game will enable us to
diagnose the level of the player.

If we allow comparisons of superiority and equality between the assessments,
thirteen cases are possible, which can be gathered into three categories:

• The three assessments are different. 

• Two out of three assessments are equivalent. 

• The three assessments are equivalent. 
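
The count of thirteen can be checked by enumeration. The following is only an illustrative sketch in Python, not part of GeneCom: the function names are hypothetical, and each case is represented as an ordering with ties of the three assessments.

```python
from itertools import product

NAMES = ("Real-A", "Player-A", "Bateleur-A")


def weak_orderings():
    """Enumerate every distinct ordering-with-ties of the three assessments."""
    seen = set()
    for scores in product(range(3), repeat=3):
        levels = sorted(set(scores))
        # Dense ranks: 0 for the worst level, then 1 and 2 for better ones.
        seen.add(tuple(levels.index(s) for s in scores))
    return sorted(seen)


def category(ranks):
    """Name the category of a case from its number of distinct levels."""
    names = {3: "all different", 2: "two equivalent", 1: "all equivalent"}
    return names[len(set(ranks))]


def describe(ranks):
    """Write a case with the signs '>' and '=', best assessment first."""
    order = sorted(range(3), key=lambda i: -ranks[i])
    text = NAMES[order[0]]
    for prev, cur in zip(order, order[1:]):
        text += (" = " if ranks[prev] == ranks[cur] else " > ") + NAMES[cur]
    return text


cases = weak_orderings()
counts = {}
for ranks in cases:
    counts[category(ranks)] = counts.get(category(ranks), 0) + 1

for ranks in cases:
    print(describe(ranks))
print(len(cases), counts)
```

Six cases have three different assessments, six have exactly two equivalent assessments, and one has three equivalent assessments.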

The analysis of these thirteen cases uses the following notation:

• The sign ">" is used for "is better than".

• The sign "<" is used for "is worse than".

• The sign "=" is used for "is equivalent to".

• Real-A is used for "the assessment of the real game".

• Player-A is used for "the assessment of the variant calculated by Bateleur and
based on the plan chosen by the player".

• Bateleur-A is used for "the assessment of the variant calculated by Bateleur and
based on the plan chosen by Bateleur".

The comparison of these three assessments yields important information about the
choices and the level of the user in comparison with Bateleur:

• If (Bateleur-A > Real-A) or (Bateleur-A > Player-A)
then the choice of Bateleur's plan is better than the user's plan.

• If (Bateleur-A < Real-A) or (Bateleur-A < Player-A)
then the choice of the user's plan is better than Bateleur's plan.

• If (Bateleur-A = Real-A) and (Bateleur-A = Player-A)
then the choices of Bateleur's plan and the user's plan are equivalent.

• If (Player-A > Real-A) then the plan is better managed by Bateleur than by the user.

• If (Player-A < Real-A) then the plan is better managed by the user than by Bateleur.

• If (Player-A = Real-A) then the user and Bateleur have the same level in managing the
plan.
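
The rules above can be transcribed literally as a small function. This is a minimal sketch under stated assumptions, not GeneCom's actual code: the assessments are taken to be numbers of points where a higher value is better, and the function name `diagnose` and the verdict strings are hypothetical. Because the first two rules are joined by "or", both may fire for the same game, so the plan verdicts are returned as a list.

```python
def diagnose(real_a, player_a, bateleur_a):
    """Apply the comparison rules to the three assessments.

    Assumption: assessments are numeric and a higher value is better.
    Returns (plan_verdicts, management_verdict).
    """
    plan = []
    if bateleur_a > real_a or bateleur_a > player_a:
        plan.append("Bateleur's plan is better")
    if bateleur_a < real_a or bateleur_a < player_a:
        plan.append("user's plan is better")
    if bateleur_a == real_a and bateleur_a == player_a:
        plan.append("plans are equivalent")

    # Player-A is Bateleur's play of the user's plan; Real-A is the user's own play.
    if player_a > real_a:
        management = "Bateleur manages the plan better"
    elif player_a < real_a:
        management = "user manages the plan better"
    else:
        management = "same level of management"
    return plan, management


# Player-A > Real-A = Bateleur-A
print(diagnose(real_a=10, player_a=20, bateleur_a=10))
# Player-A < Real-A < Bateleur-A
print(diagnose(real_a=20, player_a=10, bateleur_a=30))
```

When both plan verdicts appear in the result, the literal rules are in conflict for that game and the diagnosis on the choice of plan is ambiguous.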

When the user's level is better than Bateleur's, GeneCom tries to learn the user's plan
or management of it. When Bateleur's level is better than the user's, GeneCom can
generate a comment which indicates that Bateleur is better and why it is better.

For example, when Player-A > Real-A = Bateleur-A, Bateleur is a better manager of
the player's plan than the player himself (Player-A > Real-A) and the player's plan
is better than Bateleur's plan (Player-A > Bateleur-A). GeneCom can learn that the
player's plan is better than Bateleur's plan and it can explain that the player has chosen
an excellent plan but has not been able to manage it. GeneCom presents Bateleur's
management of this plan.

Another example: when Player-A < Real-A < Bateleur-A, the player manages his plan
better than Bateleur (Player-A < Real-A) and Bateleur's plan is better than the player's
plan (Real-A < Bateleur-A). GeneCom can learn to manage the player's plan and it can
explain why Bateleur's plan should have been chosen.



7 Limits 



7.1 Taking Other Sessions Into Account 

This method is based on the analysis of the consequences of a single session. The
diagnosis established from the analysis of one session must be balanced against the
diagnoses of many sessions by the same user.

Example: The choice of an excellent plan cannot by itself establish that the user has a
good level. It must be confirmed by regularity in the choice of good plans and in the
good management of these plans. Similarly, the excellent assessment of a session
based on an unexpected plan must not hide the bad management of a session played
under particularly favourable conditions.



7.2 Pertinence of Solutions 

The user's choices are judged and analysed in comparison with Bateleur's knowledge
(a relative reference). These judgements and analyses may nevertheless be wrong in
reality (against an absolute reference). GeneCom works like a human: it judges,
analyses, explains and learns in comparison with its own domain of knowledge
(here Bateleur's knowledge).



7.3 Generalisation / Re-using 

This method can be generalised to other domains under certain conditions: 

• The domain of application must include a choice made by the user, liable to be criti- 
cised by the system. 

• Then the user must manage this choice in order to reach a final state which can be 
evaluated. 

• The domain must enable the system to develop a variant up to a terminal position, or 
at least up to a meaningful intermediary position liable to be compared to a position 
of the same order reached by the user. 

• The domain must acknowledge that two "terminal" or "intermediary" positions can 
be unambiguously evaluated or compared. 

Several such domains can be cited: strategic games, share portfolio management, and
the diagnosis of breakdowns or malfunctions (when the user must test many
parameters to locate the problem).



8 Conclusion 

The method developed in this article can be applied to different levels of
decision. The system will be able, for example, to consider that the user is very
competent in long-term planning but cannot manage short-term planning.
The sixth section gives details of this method, which enables us to make an automatic
diagnosis of the level of knowledge of a human player compared with that of
Bateleur, which is taken as a reference. When the system considers itself better than
the player in a particular field of knowledge, it will focus the comments it generates
on this field, in order to give the player explanations targeted at his weaknesses.
Conversely, when the system considers that the player is better than itself, this
self-diagnosis gives it the precise field of knowledge which must be improved. This
diagnosis may be used as a trigger for a tutoring mechanism aimed at modifying the
evaluation which has proved to be wrong.

Abstract. A neural net based methodology for phonetic classification
of telephone speech in Spanish is described. Because of the high
computational requirements and error rates obtained with a single
Multilayer Perceptron (MLP), a different approach is needed in order to
improve the performance of the task.

In the proposed approach, the basic set of Spanish phonemes is separated
into groups according to articulation mode criteria and a Multilayer
Perceptron (MLP) is trained for every phonetic group, along with a front-end
MLP whose function is to distinguish between phonetic groups.
Experiments were made with speakers from the telephone speech OGI
corpus in order to tune the parameters of the MLPs, as well as to evaluate
the performance of the proposed methodology under different
representations of the speech signal and with different parameters of the
ANNs, such as learning rate, topology and transfer functions.

Results of the experiments are summarized and some remarks are made.
Both results and remarks are based on the analysis of the confusion
matrices obtained when the trained MLPs are used to classify the speech
used for training as well as speech data that the MLPs have not "seen".



1 Introduction 

The computational science department of the University of Valladolid is
engaged in the development of an Automatic Recognizer of Continuous and
Spontaneous Speech (ARCSS) based on the connectionist approach Artificial Neural
Networks - Hidden Markov Models (ANNs-HMMs), whose theoretical
foundations have been formally established in [1]. The whole ARCSS system is
outlined in figure 1, where the frame in dotted lines shows the work reported in
this paper.

According to figure 1, the speech signal is transformed into a sequence of 
vectors of parameters in such a way that their representation is suitable for the 
whole task. These parameters are generally meaningful in the context of speech 
processing, as well as useful in reducing the amount of data. 

The authors wish to thank the Center for Spoken Language Understanding (CSLU)
at the OGI for their kindness in providing the corpus for this work.

The use of an MLP as a statistical estimator, instead of the conventional
methods used in HMMs, has at least two advantages [2].

1. The conventional HMM method makes strong assumptions about the
statistical features of the input, such as parameterizing the input densities as
mixtures of Gaussian densities with no correlation between features, or as the
product of discrete densities for different features considered statistically
independent. These types of assumptions are not needed for an MLP estimator.

2. ANNs are a good match to discriminative objective functions, so that the
probabilities are optimized to maximize discrimination between sound classes,
rather than to closely match the distributions within each class.




Fig. 1. Scheme of the whole ARCSS system in development



The hybrid HMM/MLP statistical estimator is represented in figure 2, where
at every step n the acoustic vector and its context are presented as
inputs to the MLP. The ANN generates local probabilities, which are
used, after division by priors, as local scaled likelihoods in the Viterbi algorithm.
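
The division by priors can be illustrated with a short sketch. It rests on Bayes' rule, p(x_n | q_k) / p(x_n) = P(q_k | x_n) / P(q_k), so dividing each MLP output by the prior of its class gives a likelihood scaled by a factor that is the same for all classes. The function names and the estimation of priors from label counts are assumptions made for illustration only.

```python
def priors_from_counts(label_counts):
    """Estimate class priors P(q_k) as relative frequencies of training labels."""
    total = sum(label_counts)
    return [count / total for count in label_counts]


def scaled_likelihoods(posteriors, priors):
    """Divide MLP outputs P(q_k | x_n) by priors P(q_k).

    The result is proportional to p(x_n | q_k); the missing factor p(x_n)
    is common to all classes at step n.
    """
    return [post / prior for post, prior in zip(posteriors, priors)]


priors = priors_from_counts([25, 50, 25])      # hypothetical label counts
posteriors = [0.5, 0.3, 0.2]                   # hypothetical MLP outputs
print(scaled_likelihoods(posteriors, priors))
```

Note that a frequent class with a moderate posterior can end up with a lower scaled likelihood than a rare class, which is the intended effect of removing the priors.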

2 Motivation 

Since the methodology outlined in section 1 has been widely used in recent years,
we decided to use it as part of the ARCSS system in development. Therefore,
an analysis of the Spanish part of the telephone speech OGI corpus was made in
order to assess the feasibility of the above-mentioned methodology.

The telephone speech OGI corpus is a multilanguage corpus recorded at the
Oregon Graduate Institute of Science and Technology [3]; it is composed of
recordings acquired through the telephone line with people speaking in 22 different
languages. The main features of the Spanish part of the corpus are:

1. 108 Hispanic speakers (74 men and 34 women).

2. Telephone speech recorded at 8000 samples per second with speakers from
Mexico, Spain, Cuba, etc.; there are also many "chicano" speakers.

3. Contents per speaker:

a) Fixed words and phrases: 24 sec (3 sec, 3 sec, 8 sec and 10 sec).

b) Short descriptions: 42 sec (10 sec, 10 sec, 12 sec and 10 sec).

c) Samples of continuous elicited free speech: 50 sec.

Fig. 2. Hybrid scheme for p(x_n | q_k) estimation

Some parts of the Spanish OGI corpus are labelled; in particular, the 50
seconds of continuous elicited free speech are labelled with a set of 204 labels
taken from the Worldbet label set [3]. However, there are only a few dozen
basic labels in the corpus, which were expanded to 204 in order to account
for specific features in some phonetic contexts of the speech; the expansion was
made by adding diacritic information to the basic set of labels shown in table 1.



Table 1. Label set with diminished diacritics

NUMBER  LABEL   NUMBER  LABEL   NUMBER  LABEL   NUMBER  LABEL
1       a       16      E       31      w       46      .unk
2       e       17      .br     32      f       47      .ns
3       o       18      p       33      I       48      ?
4       s       19      D       34      bc      49      .ln
5       .pau    20      pc      35      nj      50      aI
6       i       21      V       36      tS      51      .epi
7       n[      22      d[      37              52      .nitl
8       t[      23      3       38      g       53      dZ
9       t[c     24      hs      39      L       54      ?c
10      r(      25      d[c     40      tSc     55      U
11      m       26      .ls     41      .bn     56      S
12      l       27      b       42      j       57      T
13      k       28      x       43      &       58      h
14      kc      29      G       44      r       59
15      u       30      N       45      gc      60      0

Although table 1 contains 60 labels, they can be reduced to 55 by using
the equivalences shown in table 2. It can be seen by analyzing the table of
substitutions that labels 4, 5 and 6 are replaced with labels already contained in
table 1, whereas labels 1, 2 and 3 are replaced with a single new label not included
in table 1. These replacements were made according to the documentation [3].



Table 2. Substitutions for undefined labels

Number  Label   Equivalence     Number  Label   Equivalence
1       7*      .glot           4       U       u
2       ?       .glot           5       h       x
3       ?c      .glot           6       0       o

Following the approach of section 1, a single MLP was first used as a phonetic
classifier for the 55 phones listed in tables 1 and 2. A speaker considered
suitable for the experiments was chosen from the corpus; several
characteristics were taken into account during the selection of the speaker:

a) The speech contains samples of all of the labels shown in tables 1 and 2.

b) Low level of background noise during recording.

c) Intelligibility.

Several experiments were made in order to tune the parameters of the MLP
according to similar systems reported in the literature. The particular
parameters applied as patterns to "feed" the MLP were the Perceptual Linear
Predictive coefficients (PLP) [4], which have been widely used for speech processing.

It must be mentioned that in all of the experiments reported in this paper
the speech parameterization was made at a rate of 10 milliseconds with frames
spanning 25 milliseconds; namely, patterns for training and testing were fed to
the MLP every 10 ms, with each pattern representing 25 ms of a particular phone.
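
The framing just described can be sketched as follows. At 8000 samples per second a 25 ms frame spans 200 samples and a 10 ms step spans 80 samples. The function name is hypothetical, and dropping the incomplete frame at the end of the signal is an assumption; the paper does not say how the signal edges were handled.

```python
def frame_bounds(num_samples, sample_rate=8000, frame_ms=25, step_ms=10):
    """Return (start, end) sample indices of successive analysis frames.

    Assumption: a trailing frame that does not fit completely is dropped.
    """
    frame_len = sample_rate * frame_ms // 1000   # 200 samples at 8 kHz
    step = sample_rate * step_ms // 1000         # 80 samples at 8 kHz
    bounds = []
    start = 0
    while start + frame_len <= num_samples:
        bounds.append((start, start + frame_len))
        start += step
    return bounds


one_second = frame_bounds(8000)
print(len(one_second), one_second[:2])
# 50 seconds of labelled free speech per speaker:
print(len(frame_bounds(50 * 8000)))
```

Under these assumptions, the 50 seconds of labelled speech of one speaker yield a little under 5000 patterns.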

According to the literature (e.g., [2]), a suitable topology for the MLP as phonetic
classifier should contain between 500 and 4000 units in the hidden layer of a
three-layer system. Therefore, several experiments with MLPs whose hidden layers
contained 1024 and 2048 neurons were carried out, as well as some others with
smaller systems that did not work at all.

All of the experiments reported in this article were made by supervised
learning, where a desired output pattern was provided for each class so that the
corresponding output neuron was set to 1, and the rest of them to 0. Classification
accuracy was measured by picking the MLP output with the highest value and
comparing the associated class with the actual class.
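
The target coding and the accuracy measure described above can be written in a few lines. This is an illustrative sketch with hypothetical function names and toy output values, not the code used in the experiments.

```python
def one_hot(class_index, num_classes):
    """Desired output pattern: 1 for the neuron of the class, 0 elsewhere."""
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def predict(outputs):
    """Index of the MLP output with the highest value."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


def correct_percentage(all_outputs, true_classes):
    """Percentage of patterns whose winning output matches the actual class."""
    hits = sum(1 for out, true in zip(all_outputs, true_classes)
               if predict(out) == true)
    return 100.0 * hits / len(true_classes)


print(one_hot(2, 4))
outputs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]   # toy MLP outputs
print(correct_percentage(outputs, [0, 1, 1]))
```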

Figure 3 shows the results obtained by classifying the same corpus used for
training. Figure 3(a), in particular, shows the results obtained with a 12x1024x55
MLP, where the weights were updated every 100 iterations; the learning rate
was set to 0.01 and the inertia factor (momentum) to 0.8. It can be seen that the
correct percentage is barely 25 % and that the curve behaves irregularly, since
it rises and falls abruptly as training advances.

Figure 3(b) shows the results for an experiment carried out under the same
conditions as above, except for a change in the topology of the MLP
to 12x2048x55. The only improvement observed is that the ANN learns faster
than in figure 3(a); however, the peak correct percentage is still barely 25 %.





Fig. 3. Correct percentage in the classification of the corpus used for training
(panels (a) to (d))



In figure 3(c) the conditions of the experiment reported in figure 3(b) were
kept, except for an increase in the learning rate to 0.1. Two improvements
can be observed in this experiment: the MLP learned slightly faster and the
correct percentage rose to almost 35 %.

Finally, in figure 3(d) the learning rate was set to 0.01 and the topology
was changed to 36x1024x55. Here we wanted to analyze the behaviour
of the MLP when context is introduced in the input patterns. The ANN was
therefore fed with a contextual pattern spanning three neighbouring single
patterns (3 x 12 = 36 inputs).

It can be seen in this last figure that the correct percentage increases steadily
as training proceeds. However, although the behaviour seems to be the expected
one, since the accuracy never falls abruptly, the result is worse than in the rest of
the experiments shown in figure 3, since the peak accuracy is only about 12 %. This
experiment was interrupted because of its computational cost.

Several conclusions can be obtained from these experiments: 

a) The correct percentages observed in all of the experiments are fairly low.
Therefore, an approach to increase the classification efficiency is needed.

b) It seems necessary to use more than 2048 neurons in the hidden layer of the
MLP in order to obtain better results than the ones observed.

c) Since the use of context has been repeatedly recommended, not only for
speech processing by ANNs but also for other methodologies
such as Hidden Markov Models (HMMs) and Dynamic Time Warping (DTW),
it must be considered in the design of the classification system.

d) A learning rate of 0.1 seems to work well under the conditions of the
experiments that we have carried out.

It should be noticed that conclusions b) and c) imply the use of MLPs with
thousands of neurons and tens of thousands of weights, which are expensive to
train. This is confirmed by the scientific literature, which mentions
that the training of such ANNs is carried out on powerful parallel supercomputers [5].
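
The order of magnitude of these figures can be checked for the three topologies used in figure 3. The sketch below counts the connection weights of a fully connected three-layer MLP; whether bias terms were used in the experiments is not stated, so they are optional here, and the function name is hypothetical.

```python
def mlp_weights(layers, with_bias=False):
    """Number of weights of a fully connected feed-forward MLP.

    layers: sizes of the successive layers, e.g. (12, 1024, 55).
    with_bias: also count one bias per non-input unit (an assumption).
    """
    weights = sum(a * b for a, b in zip(layers, layers[1:]))
    if with_bias:
        weights += sum(layers[1:])
    return weights


for topology in [(12, 1024, 55), (12, 2048, 55), (36, 1024, 55)]:
    print(topology, mlp_weights(topology), mlp_weights(topology, with_bias=True))
```

With these counts, doubling the hidden layer doubles the number of weights, while tripling the input size through context raises it by roughly a third.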

On the other hand, we cannot directly extrapolate the results reported in
the literature, because we are using a very different corpus (since
we are interested in the Spanish language), whereas most of the reported
experiments using MLPs as phonetic classifiers have been carried out in other
languages (e.g., see [1] for German and English).

3 The Proposed Methodology 

The main aim of the proposed methodology is to design a smaller system
whose computational requirements are lower than those of the system described above.
Therefore, we propose a hierarchically organized scheme for phonetic
classification; such a system is shown in figure 4. In the proposed methodology, a
front-end MLP classifies input patterns according to their articulation mode. It should
be noticed that articulatory features are a very common means of categorical
classification and analysis in linguistics [6] [7].

Once the front-end MLP has been trained to distinguish between articulatory
classes, input patterns are introduced for classification. As in the previous
experiments, the class associated with the output with the highest value is chosen
as the category that the input pattern belongs to.

Next, once we "know" to some extent the category of the input pattern,
the specialized MLP associated with the recognized category receives the input in
order to accomplish the final classification. These specialized MLPs have been
trained previously to discriminate between the phonemes inside each particular
class provided by the front-end MLP. This scheme is closely related to the
Hierarchical Mixture of Experts (HME), a methodology that has been proposed recently
as an approach to the divide-and-conquer principle [8].
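
The two-stage decision described above can be sketched as follows. The trained MLPs are replaced here by stand-in functions returning fixed scores, and all names (`classify`, `front_end`, `experts`) are hypothetical; only the control flow is meant to reflect the scheme.

```python
def argmax(values):
    """Index of the highest value."""
    return max(range(len(values)), key=lambda i: values[i])


def classify(pattern, front_end, experts, class_names, phonemes):
    """Hierarchical classification: class first, then phoneme inside the class."""
    category = class_names[argmax(front_end(pattern))]
    members = phonemes[category]
    if len(members) == 1:
        # A class with a single phoneme needs no specialized MLP.
        return category, members[0]
    return category, members[argmax(experts[category](pattern))]


# Toy stand-ins for trained networks (assumptions for illustration only).
CLASS_NAMES = ["nasal", "affricate"]
PHONEMES = {"nasal": ["m", "n", "ñ"], "affricate": ["ch"]}


def front_end(pattern):
    return [0.8, 0.2] if pattern[0] > 0 else [0.1, 0.9]


def nasal_expert(pattern):
    return [0.2, 0.7, 0.1]


experts = {"nasal": nasal_expert}
print(classify([1.0], front_end, experts, CLASS_NAMES, PHONEMES))
print(classify([-1.0], front_end, experts, CLASS_NAMES, PHONEMES))
```

Unlike a mixture of experts with soft gating, this sketch makes a hard choice of one specialized classifier, so an error of the front-end cannot be recovered at the second stage.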

It should be noticed that the scheme of figure 4 associates the input patterns
labelled with one of the 55 non-diacritic labels of tables 1 and 2 with 26 phonetic
classes. This relationship is shown in table 3 and is based mainly on the
characterization described by Fuentes [7]. The label /ai/ is included as a phoneme for
consistency with the documentation provided along with the corpus (see [3]).




Fig. 4. Hierarchical scheme for phonetic classification 



4 Experiments and Results 

The experiments were carried out with several speakers of the OGI corpus by
using four kinds of parameters as patterns: Linear Prediction Coefficients (LPC),
cepstrum, Perceptual Linear Prediction (PLP) coefficients and Mel cepstrum.

Furthermore, we experimented with the inclusion of parameters such as
energy and Zero Crossing Rate (ZCR) at the frame level, both of them normalized
to the range [0,1]. The use of contextual information was also examined during
the development of the experiments.
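
The two additional frame-level parameters can be sketched as below. The definitions are common textbook ones and are assumptions here: energy as the sum of squared samples, ZCR as the fraction of adjacent sample pairs that change sign, and min-max scaling over a set of frames for the normalization to [0,1]. The paper does not specify which definitions or normalization were used.

```python
def frame_energy(frame):
    """Short-time energy: sum of squared samples of the frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (already in [0,1])."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def min_max(values):
    """Scale a list of values linearly to the range [0,1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


frames = [[1, -2, 2], [0, 1, 1], [3, 3, -3]]     # toy frames
energies = [frame_energy(f) for f in frames]
print(min_max(energies))
print([zero_crossing_rate(f) for f in frames])
```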

4.1 The Front-End Classifier 

The front-end classifier has a topology with 100 hidden units and 10 output
classes; this arrangement was chosen after several setup experiments. Learning
rate and inertia were fixed to 0.2 and 0.8 respectively; the weights of the MLP
were updated every 100 examples. Although the learning rate may seem high
at first sight, it was good enough to guarantee fast learning, as shown next.



Table 3. Worldbet to Spanish symbol association

Category            Spanish symbol  Worldbet symbols
Unvoiced occlusive  p               p pc
                    t               t[ t[c
                    k               k kc
Voiced occlusive    b               V b bc
                    d               D d[ d[c
                    g               G g gc
Unvoiced fricative  f               f
                    z               T
                    s               s hs S
                    j               x h
Voiced fricative    y               j
Affricate           ch              tS tSc
Lateral             l               l
                    ll              L dZ
Vibrant             rr              r
                    r               r(
Nasal               m               m
                    n               n[ N
                    ñ               nj
Vowel               a               a 3 V @
                    e               e E
                    i               i I
                    o               o 0
                    u               u w U
                    ai              aI
Non-speech                          .pau .br .ls .glot .bn .unk .ns .ln .epi .nitl



Figure 5 shows some typical curves obtained by testing and monitoring the
progress of the learning procedure in the front-end MLP. Figure 5(a), in
particular, shows the correct percentage obtained when the training corpus is classified,
as a function of the number of examples used for learning, whereas the error rate
as a function of the same parameter is shown in figure 5(b).



In order to provide a few parameters to evaluate the performance of the
trained MLP, two measurements were obtained from the confusion matrices: the
average and the maximum correct percentage across training. These
measurements were made with both the training and the test corpus for a particular
speaker. The results for all of the experiments carried out during training and test of
the front-end MLP for a particular speaker are shown in tables 4 and 5.
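
The two measurements can be computed from a sequence of confusion matrices, one per evaluation point during training. The sketch below assumes that rows are actual classes and columns are assigned classes, so that correct decisions lie on the diagonal; the function names and the toy matrices are hypothetical.

```python
def correct_percentage(confusion):
    """Percentage of patterns on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return 100.0 * correct / total


def average_and_maximum(matrices):
    """AVG and MAX correct percentage over the matrices collected across training."""
    scores = [correct_percentage(m) for m in matrices]
    return sum(scores) / len(scores), max(scores)


early = [[5, 5], [5, 5]]      # toy matrix early in training
late = [[8, 2], [1, 9]]       # toy matrix later in training
print(average_and_maximum([early, late]))
```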





Fig. 5. Typical curves for a single MLP: (a) correct percentage; (b) error rate.



The correct classification percentages of tables 4 and 5 are shown for the four
kinds of speech parameters used as input patterns. A size of 14 corresponds to
twelve basic parameters plus energy and ZCR; the rest of the rows
give the results obtained by using a context of one, two and three neighbour frames
(42, 70 and 98 inputs respectively).
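
The input sizes of tables 4 and 5 are consistent with stacking each 14-parameter frame with k neighbour frames on each side, giving 14 x (2k + 1) inputs: 42, 70 and 98 for k = 1, 2 and 3. This reading is inferred from the sizes, and the sketch below is an assumption in two respects: the function name is hypothetical, and edge frames are repeated at the beginning and end of the utterance.

```python
def stack_context(frames, k):
    """Concatenate every frame with its k left and k right neighbours.

    Assumption: frames beyond the utterance limits are replaced by the
    first or last frame.
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        window = []
        for offset in range(-k, k + 1):
            index = min(max(t + offset, 0), n - 1)
            window.extend(frames[index])
        stacked.append(window)
    return stacked


frames = [[float(t)] * 14 for t in range(5)]   # five toy frames of 14 parameters
for k in (0, 1, 2, 3):
    print(k, len(stack_context(frames, k)[0]))
```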

It can be seen from table 4 that the results obtained are very irregular from
one kind of speech parameters to another (e.g., from LPC to cepstrum). This
situation is due to the great differences in scale between one parameter
and another, since energy at the frame level is closely related to either the
cepstrum or the Mel cepstrum, whereas for LPC and PLP the parameter related
to energy (i.e., the gain G of the predictor) is not included in the input pattern.

The experiments reported in table 5 were made in order to evaluate the
performance of the classifier when the differences in the scale of the input
parameters are reduced. To this end, the inputs to the MLP were normalized to the
range [-1,1] before being applied to the classifier, for both training and test.

Furthermore, since it is well known that MLPs composed of traditional
sigmoid functions with output range [0,1] in the hidden layer waste most of the
early training in biasing their weights to the mean activation value, which is
different from zero, we changed the transfer functions of this layer to the symmetric
sigmoid with output range [-1,1] and mean value 0 [5].
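
Both changes can be illustrated briefly. The sketch assumes a linear (min-max) normalization of each parameter to [-1,1] and takes the hyperbolic tangent as the symmetric sigmoid; the exact normalization and the exact symmetric function used in the experiments are not given in the text, and the function names are hypothetical.

```python
import math


def scale_to_symmetric(values):
    """Linearly map a list of values to the range [-1,1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def logistic(x):
    """Traditional sigmoid, output range [0,1], value 0.5 at x = 0."""
    return 1.0 / (1.0 + math.exp(-x))


def symmetric_sigmoid(x):
    """Symmetric sigmoid, output range [-1,1], value 0 at x = 0 (tanh assumed)."""
    return math.tanh(x)


print(scale_to_symmetric([0.0, 5.0, 10.0]))
print(logistic(0.0), symmetric_sigmoid(0.0))
```

For inputs centred on zero, the logistic unit outputs values around 0.5 while the symmetric unit outputs values around 0, which is the difference in mean activation referred to above.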



4.2 The Intra-class Classifiers 

A full set of charts and tables like those shown in section 4.1 has been
obtained for every class at the output of the front-end MLP, except the voiced
fricative, affricate and non-speech classes, by training and testing the
class-specific MLP for the final classification. On the one hand, it is not
necessary to use an MLP for voiced

Table 4. Correct percentages for the front-end MLP with no normalization

TRAINING CORPUS
        LPC                 Cepstrum            PLP                 Mel cepstrum
SIZE    AVG       MAX       AVG       MAX       AVG       MAX       AVG       MAX
14      94.43 %   98.46 %   10.57 %   28.42 %   89.46 %   95.62 %   17.25 %   60.07 %
42      95.46 %   99.59 %   12.40 %   34.41 %   90.14 %   96.55 %   23.95 %   60.48 %
70      97.87 %   99.96 %   14.45 %   36.28 %   94.07 %   98.74 %   17.59 %   49.01 %
98      98.72 %   100.00 %  11.67 %   29.59 %   95.80 %   99.39 %   13.99 %   44.99 %

TEST CORPUS
        LPC                 Cepstrum            PLP                 Mel cepstrum
SIZE    AVG       MAX       AVG       MAX       AVG       MAX       AVG       MAX
14      57.47 %   64.84 %   10.42 %   25.14 %   66.24 %   69.22 %   16.80 %   52.68 %
42      67.93 %   73.68 %   12.54 %   35.44 %   75.38 %   79.20 %   22.98 %   59.37 %
70      74.05 %   77.66 %   14.69 %   39.50 %   80.33 %   84.06 %   16.90 %   49.35 %
98      77.84 %   80.17 %   11.77 %   32.20 %   85.28 %   87.75 %   13.92 %   49.51 %

Table 5. Correct percentages for the front-end MLP with normalization

TRAINING CORPUS
        LPC                 Cepstrum            PLP                 Mel cepstrum
SIZE    AVG       MAX       AVG       MAX       AVG       MAX       AVG       MAX
14      87.10 %   94.33 %   89.18 %   95.50 %   88.91 %   96.03 %   87.95 %   95.78 %
42      97.04 %   99.80 %   97.85 %   99.96 %   97.45 %   99.92 %   97.07 %   99.76 %
70      98.59 %   100.00 %  99.03 %   100.00 %  99.03 %   100.00 %  98.82 %   100.00 %
98      99.17 %   100.00 %  99.48 %   100.00 %  99.45 %   100.00 %  99.30 %   100.00 %

TEST CORPUS
        LPC                 Cepstrum            PLP                 Mel cepstrum
SIZE    AVG       MAX       AVG       MAX       AVG       MAX       AVG       MAX
14      60.17 %   63.42 %   54.46 %   62.45 %   68.12 %   73.48 %   63.86 %   66.95 %
42      66.95 %   72.51 %   63.64 %   68.09 %   70.19 %   75.34 %   70.86 %   74.41 %
70      71.58 %   76.28 %   71.03 %   75.43 %   76.93 %   81.06 %   76.37 %   78.79 %
98      76.95 %   80.37 %   76.33 %   80.21 %   83.39 %   85.16 %   76.48 %   82.20 %

fricative and affricate phones since there is only one item in each class so that 
no further classification is needed. On the other hand, we are not interested at 
this time in recognizing different classes of non-speech patterns. 

Therefore, in order to summarize the results of the experiments, a single table
containing the final results in phonetic classification for a particular speaker is
included instead of a full set of charts and tables for every speaker. Table 6
shows the results for the best case of the PLP with three frames of context and
no normalization (i.e., 98 inputs to the MLP).

It was found that in general the PLP parameters provide the best
performance across the whole set of experiments. However, there are specific cases
where better results were obtained with another kind of parameters. The analysis
and discussion of these cases is out of the scope of this paper.



Table 6. Results of the whole classification task for both the training and test
corpus

              CLASS CORRECT                  PHONEME CORRECT       NEAT ACCURACY
CATEGORY      TRAIN      TEST      PHONEME   TRAIN      TEST       TRAIN      TEST
U. occlusive  99.18 %    92.51 %   p         100.00 %   71.43 %    99.18 %    66.08 %
                                   t         96.88 %    89.61 %    96.09 %    82.90 %
                                   k         95.65 %    84.42 %    94.87 %    78.10 %
V. occlusive  99.13 %    86.10 %   b         97.50 %    100.00 %   96.65 %    86.10 %
                                   d         100.00 %   97.44 %    99.13 %    83.90 %
                                   g         100.00 %   100.00 %   99.13 %    86.10 %
U. fricative  98.43 %    89.03 %   f         100.00 %   100.00 %   98.43 %    89.03 %
                                   z
                                   s         100.00 %   98.51 %    98.43 %    87.70 %
                                   j         100.00 %   100.00 %   98.43 %    89.03 %
V. fricative  100.00 %   100.00 %  y         100.00 %   100.00 %   100.00 %   100.00 %
Affricate     100.00 %   100.00 %  ch        100.00 %   100.00 %   100.00 %   100.00 %
Lateral       98.43 %    90.36 %   l         100.00 %   100.00 %   98.43 %    90.36 %
                                   ll        100.00 %   100.00 %   98.43 %    90.36 %
Vibrant       100.00 %   70.49 %   rr
                                   r                               100.00 %   70.49 %
Nasal         98.00 %    89.84 %   m         100.00 %   98.41 %    98.00 %    88.41 %
                                   n         98.75 %    98.80 %    96.78 %    88.76 %
                                   ñ         100.00 %   97.62 %    98.00 %    87.70 %
Vowel         85.12 %    82.05 %   a         98.19 %    91.91 %    83.58 %    75.41 %
                                   e         94.37 %    85.47 %    80.33 %    70.13 %
                                   i         100.00 %   94.12 %    85.12 %    77.22 %
                                   o         98.70 %    90.36 %    84.01 %    74.14 %
                                   u         100.00 %   94.57 %    85.12 %    77.59 %
                                   ai

It should be mentioned that a topology IxO^2xO was used for all of the
experiments; I and O stand for the number of inputs and output classes
respectively. That is, the number of hidden units was set to the square of the
number of output classes. Besides, since there were fewer training examples for the
intra-class MLPs than for the front-end MLP, the learning rate was changed to 0.1
and the weights were updated every 10 examples instead of every 100 as before.

There are some empty cells in table 6; an empty cell means
that there were no examples of the particular phone available for
the experiments. The last two columns of this table indicate the neat (net)
accuracy of the classification task for every particular phoneme; this is obtained
by multiplying the class and phoneme correct percentages of the earlier columns,
for training and test respectively.
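
As a worked check of this product, the values of table 6 for the phoneme /p/ on the test corpus are 92.51 % for the class and 71.43 % for the phoneme, which gives 66.08 %. The function name in the sketch below is hypothetical.

```python
def net_accuracy(class_correct, phoneme_correct):
    """Product of two percentages, expressed again as a percentage."""
    return class_correct * phoneme_correct / 100.0


# /p/, test corpus: class 92.51 %, phoneme 71.43 %
print(round(net_accuracy(92.51, 71.43), 2))
# /a/, test corpus: class 82.05 %, phoneme 91.91 %
print(round(net_accuracy(82.05, 91.91), 2))
```

The product treats the second stage as being evaluated only on patterns that the front-end MLP has assigned to the right class.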



5 Conclusions 

A comprehensive methodology for the phonetic classification of speech signals
in Spanish has been described; this methodology is based on the
divide-and-conquer principle and suggests the use of specialized MLPs as class and
intra-class classifiers.

Although the use of several classifiers organized by phonetic categories is more
complex than a single phonetic classifier, it should be noticed that by summing
up the number of hidden units of the specialized MLPs, and keeping the
topology IxO^2xO mentioned above, no more than 200 hidden units are needed for
the implementation of a hierarchical classifier for the set of phones of table 3,
compared to the number of hidden units suggested for the single MLP [2].
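
The bound of 200 hidden units can be verified from the class sizes of table 3. The sketch assumes that a specialized MLP is built only for classes with more than one phoneme, that the non-speech class is not classified further, and that the front-end MLP has the 100 hidden units stated in section 4.1; the variable names are hypothetical.

```python
# Number of Spanish phonemes per category, read from table 3.
CLASS_SIZES = {
    "unvoiced occlusive": 3, "voiced occlusive": 3, "unvoiced fricative": 4,
    "voiced fricative": 1, "affricate": 1, "lateral": 2, "vibrant": 2,
    "nasal": 3, "vowel": 6,
}
FRONT_END_HIDDEN = 100   # stated for the front-end MLP (10 output classes)


def hidden_units(num_outputs):
    """Hidden layer size under the IxO^2xO topology."""
    return num_outputs ** 2


intra_class = sum(hidden_units(size) for size in CLASS_SIZES.values() if size > 1)
print(intra_class, FRONT_END_HIDDEN + intra_class)
```

Under these assumptions the specialized MLPs need 87 hidden units in total and the whole hierarchy 187, against the 500 to 4000 suggested for a single MLP.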

On one hand, the use of the proposed methodology has the advantages of a 
faster training, due to the small number of neurons of the whole system, and, on 
the other hand, more control is gained on the experiments, since it is possible 
to set bounds for local observations in every single classifier. 

It can be said, regarding the local observations, that small MLPs (e.g., two 
or three output classes) need much less training than the bigger ones, and 
their performance is higher, as shown in table 6. In contrast, large MLPs 
need much more training and their performance is lower than that of the smaller ones. 

In some phonetic categories PLP is not the best choice of representation, 
whereas in others the performance is insensitive to, or even slightly degraded 
by, the use of context. These observations, which have not been analyzed here, 
could be useful in the design of a phonetic classifier with heterogeneous inputs 
(e.g., different representations of the input patterns at the speech frame level 
according to the classification made by the front-end MLP). 

Abstract: Dominant Point Detection (DPD) is one of the tasks in image 
analysis; it aims at making polygonal approximations through the search 
for a set of points of relevance in a contour, reducing the amount of 
information. In this work, the ability of neural networks to learn the 
performance of several DPD algorithms is studied. For this purpose, a dynamic 
neural net that traverses the contour is used, giving a relevance 
measurement for each point and detecting the dominant points through a simple 
post-processing phase. Different training sets and net configurations were 
used. The results of applying the neural algorithm to images of real 
objects show its validity, and also the ability of neural nets to learn 
previously unknown DPD algorithms. 



1. Introduction 

Polygonal approximations are often used as a data reduction procedure in digital 
curves. One of the main ways to perform this task is through Dominant Point 
Detection (DPD), that is, the search for those points of highest significance in the 
curve, ideally preserving its morphology and reducing the number of points in it. In 
[San96] the ability of artificial neural networks (ANNs) to learn the behaviour of 
a DPD algorithm [MO93] was demonstrated. That work showed that a Time Delay 
Neural Network (TDNN) [Her91] was able to traverse a contour and detect dominant 
points with 85% coincidence with respect to the output of the algorithm, after a 
training session in which the net was informed about the points detected by the 
original algorithm. 

It would be very interesting to show the ability of the nets to learn this kind of 
algorithm and to generalise this concept, in order to simulate the behaviour of other 
unknown algorithms. If this is possible, any DPD algorithm that is suitable for a given 
task could be learnt by the net and executed in a time independent of the complexity of 
the designed algorithm. Moreover, if the net is implemented in hardware, the 
substantial reduction in time would permit its inclusion in real-time image processing 
systems. 

This work tries to generalise our former results, obtained with a single DPD 
algorithm, and looks for a network architecture able to provide the best results 
regardless of the algorithm considered. To this end, we have selected five algorithms: 
Rosenfeld-Johnston (R-J) [RJ73], Rosenfeld-Weszka (R-W) [RW75], Freeman-Davis 
(F-D) [FD77], Teh-Chin (T-Ch) [TCh89], and Non-Collinear Dominant Points 
(NCDP) [Ine98]. These algorithms have also been extended to deal with open curves, 
in order to increase the cardinality of our training set, as explained below. 



1.1 Dominant Point Detection in Open Curves 

A number of algorithms (e.g. skeletonization) produce open curves as their 
output, so closed curves belonging to object contours are not the only ones to which 
DPD algorithms can be applied. In order to use both open and closed contours in our 
training set, we have devised a way to apply the algorithms cited above (originally 
designed for closed curves only) to open curves, while keeping the nature of the 
algorithms unchanged. 

The algorithms studied need the determination of a support region for each point of 
the curve, in which the local discrete curvature is assessed. Sometimes a fixed number 
of points is considered, while for other algorithms it varies depending on the local 
features of each region. The Freeman-Davis algorithm belongs to the former group, and 
the Teh-Chin and NCDP algorithms to the latter. Rosenfeld-Johnston and Rosenfeld-Weszka, 
although they also set variable support regions, need a fixed number of points 
to be analysed before and after the point. 

With open contours it is impossible to satisfy the existence of the support region for 
the first and last points, at least. A possible solution for all those points in which the 
support region cannot be determined is to artificially expand the curve while keeping the 
smoothness at the first and last points. For this, a reflection of the curve with respect 
to its extreme points is proposed, with as many points as needed by each algorithm. 
This way, the support region is always computable for all the points in the original 
contour, and the algorithms work without any change in their procedure. 
The first and last points are always labelled as dominant. 
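
The extension of an open curve can be sketched as follows. This is only an illustration under stated assumptions: the curve is taken as a list of (x, y) points, the "reflection with respect to the extreme points" is interpreted as a point reflection through each end point, and the names `extend_open_curve` and `k` (the number of extra points an algorithm needs) are hypothetical.

```python
def extend_open_curve(points, k):
    """Extend an open curve by k mirrored points at each end.

    Assumption: the reflection is a point reflection through the end
    point, p' = 2 * p_end - p, which keeps the curve smooth there.
    """
    if k >= len(points):
        raise ValueError("k must be smaller than the number of points")
    (x0, y0), (xn, yn) = points[0], points[-1]
    # Mirror points 1..k through the first point; the slice is reversed
    # so that the extension leads into the original start.
    head = [(2 * x0 - x, 2 * y0 - y) for x, y in points[k:0:-1]]
    # Mirror the k points preceding the last point through the last point.
    tail = [(2 * xn - x, 2 * yn - y) for x, y in points[-2:-k - 2:-1]]
    return head + list(points) + tail
```

After the extension, the DPD algorithm is run unchanged and only the labels of the original points are kept.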

It is important to know whether both the algorithms and the nets are computing 
good approximations. As a measurement of the quality of the point selection, the 
Optimisation Error ($E_0$) [Ine98] will be used. This number gives an integrated 
evaluation of the quality of the approximation to the curve and of the information 
reduction rate. It is defined as 

$$E_0 = E_2 \cdot \frac{n_d}{N},$$

where $N$ is the length in points of the curve, $n_d$ is the number of dominant points, and 

$$E_2 = \frac{1}{N}\sum_{i=1}^{N} e_i^2$$

is the mean squared error, $e_i$ being the distance from the $i$-th contour point to the 
segment of the polygonal approximation that joins the dominant points before and after that point. 
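
As an illustration, the optimisation error can be computed as sketched below. The sketch assumes that the error is the mean squared distance scaled by the fraction of points retained, E_0 = E_2 · n_d / N, and that the curve is open with its first and last points marked as dominant; the function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p on the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def optimisation_error(points, dominant):
    """Assumed form E_0 = E_2 * n_d / N for an open curve.

    `dominant` holds sorted indices into `points` and must include the
    first and the last point.
    """
    n = len(points)
    sq_sum = 0.0
    for start, end in zip(dominant, dominant[1:]):
        for i in range(start, end):
            d = point_segment_distance(points[i], points[start], points[end])
            sq_sum += d * d
    e2 = sq_sum / n  # mean squared error; the last point contributes 0
    return e2 * len(dominant) / n
```

Keeping every point gives a zero error but no data reduction, while dropping points lowers n_d / N at the cost of a larger E_2; the product balances both effects.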

2. Neural Detection of Dominant Points 



For DPD, an architecture based on dynamic neural nets of the TDNN kind [Her91, 
Can96] was chosen. It can be shown that at each time t these nets behave like 
multilayer perceptrons, and therefore all the learning procedures for perceptrons are 
applicable to their weight adjustment [Sta92, Thi95, Loo96]. In this case, the 
dynamic aspect of the net is the traversing of the curve, and the temporal parameter is 
the position of the net on the curve as it moves along it. 

The support region is the zone of the curve used to calculate a local measurement 
of the discrete curvature. This measurement will be used to discriminate dominant 
points. For each contour point, the $s$ neighbours before and after it define a pattern 
in the training set. The multilayer net has a first layer of $2s+1$ neurons to input the 
values of the chain-codes for the support region of the point under study, a hidden 
layer with a number of neurons to be determined, and a single neuron in the output 
layer. The signal at the input neurons is a modified Freeman chain-code 
representation of the support region, as described in detail below, and the net 
is trained to output a measure of the curvature at the central point of that region. 
A simple procedure is then applied to this value to determine whether the central 
point of the pattern can be considered as dominant. 

Let $a_i$ be the input value to the $i$-th input unit of the network. Since our curves are 
represented through 8-neighbour chain-codes (a segment whose direction is coded as 
$f_i \in \{0, 1, \ldots, 7\}$ arrives at the $i$-th point $p_i$), a change to this codification is 
needed to keep $a_i$ in the interval [0,1]. To achieve a better generalisation of the curvature 
by the network, a codification independent of the contour orientation is desired. The new 
codification proposed for $a_i$ is computed as shown in Table 1. 



Table 1. Freeman chain-code transformation. 

$\delta_i = f_{i+1} - f_i$ 
if $\delta_i > 4$ then $\delta_i = 8 - \delta_i$ 
if $\delta_i < -4$ then $\delta_i = 8 + \delta_i$ 

$\delta_i$      $a_i$ 
  -4          0 
  -3          0.125 
  -2          0.25 
  -1          0.375 
   0          0.5 
   1          0.675 
   2          0.75 
   3          0.875 
   4          1 
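
The transformation of Table 1 can be sketched as a small lookup. The code follows the table as printed, including the wrap rule and the value 0.675 for a difference of 1; the names `TABLE_1` and `chain_code_to_input` are hypothetical.

```python
# Values copied from Table 1 as printed. A linear map (delta + 4) / 8
# would give 0.625 for delta = 1; the printed value is kept here.
TABLE_1 = {-4: 0.0, -3: 0.125, -2: 0.25, -1: 0.375, 0: 0.5,
           1: 0.675, 2: 0.75, 3: 0.875, 4: 1.0}

def chain_code_to_input(codes):
    """Map a Freeman chain code (values 0..7) to net inputs in [0, 1]."""
    signal = []
    for f_cur, f_next in zip(codes, codes[1:]):
        delta = f_next - f_cur  # first difference
        # Wrap rules as printed in Table 1. Note that a modular wrap
        # (delta - 8) would instead preserve the turn direction.
        if delta > 4:
            delta = 8 - delta
        elif delta < -4:
            delta = 8 + delta
        signal.append(TABLE_1[delta])
    return signal
```

Because only differences between consecutive codes are used, rotating the whole contour by a multiple of 45 degrees leaves the input signal unchanged.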



Target outputs have to be provided for training the network. For this, a DPD 
algorithm is applied to the curves to obtain the dominant points. Let us consider that we 
get $M$ dominant points $\{dp_1, \ldots, dp_M\}$, where $dp_j \in \{1, \ldots, N\}$ ($j \in \{1, \ldots, M\}$) 
indexes the order in the contour of the dominant point $j$, $N$ being the size of the curve. As shown 
in [San96], the learning of the net is simplified if each dominant point is considered 
not just as a single point but as a “cloud” of influence, spreading itself to its vicinity. 

For a point $i$, let $d_{\min,i} = \min\{d_{i,dp_1}, \ldots, d_{i,dp_M}\}$ be the distance to the closest 
dominant point. Each distance $d_{i,dp_j}$ can be computed as the norm of the vector that joins the 
co-ordinates $(x, y)$ of the point $i$ and the dominant point $dp_j$ or, to simplify the calculations, 
as the difference between the positions of $i$ and $dp_j$. 

The spreading of the points can be done using a number of approximations; for 
example, through a triangular signal with its vertex at the dominant point: 

$$t_i = \begin{cases} 1 - \dfrac{d_{\min,i}}{v+1} & \text{if } d_{\min,i} \le v+1 \\ 0 & \text{otherwise} \end{cases}$$

where $t_i$ is the target output presented to the net during the training process for the 
$i$-th point of the curve, and $v$ is the maximum distance 
allowed in the analysis of the point. Another possibility is the use of a Gaussian function 
of $d_{\min,i}$ centred at the dominant point, where $\sigma$ is a parameter whose value 
depends on the desired level of spreading. In the analysis performed, no significant 
difference has been noticed, so one can choose one or the other based on efficiency 
considerations. 
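
A minimal sketch of the triangular spreading follows. It assumes the form t_i = 1 − d_min,i / (v + 1) within distance v + 1 of a dominant point and 0 elsewhere, measures distance as a difference of positions along a closed contour, and uses the hypothetical name `triangular_targets`.

```python
def triangular_targets(n_points, dominant, v):
    """Target t_i for every point of a closed curve with n_points points.

    `dominant` lists the indices of the dominant points and `v` is the
    maximum distance allowed in the analysis of a point.
    """
    targets = []
    for i in range(n_points):
        # Distance as a difference of positions along the contour,
        # taking the shorter way round because the curve is closed.
        d_min = min(min(abs(i - dp), n_points - abs(i - dp)) for dp in dominant)
        targets.append(1 - d_min / (v + 1) if d_min <= v + 1 else 0.0)
    return targets
```

With `v = 1`, for instance, each dominant point receives target 1 and its immediate neighbours 0.5, so the net is rewarded for responding near a dominant point and not only exactly on it.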



Algorithm for preparing the training set: 

For each digital curve in the training set: 

1. Apply the DPD algorithm to be learnt, finding the set of dominant points detected by it. 

2. Compute the target output at each curve point, according to the selected spreading function. 

3. Apply the translation to the input code, obtaining the signal $A = \{a_1, \ldots, a_N\}$. 

4. For each value $a_i$, make a training pattern taking $\{a_{i-s}, \ldots, a_i, \ldots, a_{i+s}\}$ as 
   input and $t_i$ as output. 
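
Step 4 can be sketched as a sliding window over the input signal. The sketch assumes a closed curve, so the window wraps around the ends; `make_patterns` is a hypothetical name and `s` is the half-width of the support region.

```python
def make_patterns(signal, targets, s):
    """Pair each window of 2*s + 1 inputs with the target of its centre."""
    n = len(signal)
    patterns = []
    for i in range(n):
        # Closed curve assumed: indices wrap around modulo n.
        window = [signal[(i + k) % n] for k in range(-s, s + 1)]
        patterns.append((window, targets[i]))
    return patterns
```

Each curve of N points therefore contributes N training patterns, one per position of the net on the curve.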

Once the training set is ready, the net is trained by the backpropagation algorithm 
[Tre95], using the error function 

$$E = \frac{1}{2N_p}\sum_{i=1}^{N_p} (t_i - o_i)^2,$$

where $N_p$ is the number of patterns in the training set, $t_i$ is the target signal of the 
output neuron at the point $p_i$, and $o_i$ is the actual signal of the output neuron when 
the net is centred at the point $p_i$. 
The application of the neural DPD algorithm can be described as follows: 

Algorithm of neural DPD: 

1. Translate the Freeman chain-code of a curve, $F = \{f_1, f_2, \ldots, f_N\}$, to the net 
   input signal $A = \{a_1, a_2, \ldots, a_N\}$ as explained above, $N$ being the 
   number of points in the curve. 

2. Activate the net for each code segment of the chain $\{a_{i-s}, a_{i-s+1}, \ldots, a_i, \ldots, a_{i+s}\}$, 
   getting the output signal $B = \{b_1, \ldots, b_N\}$ as an estimation of the 
   local curvature at each point provided by the net. 

3. Apply a curvature threshold $K_0$ to the signal $B$, obtaining a new signal 
   $C = \{c_1, \ldots, c_N\}$ in the following manner: 

       IF $b_i < K_0$ THEN $c_i = 0$ ELSE $c_i = b_i$ 

   This signal is considered as formed by segments where $c_i > 0$, separated 
   by gaps where $c_i = 0$. 

4. Calculate the centroid position $C_j$ for each segment of the curve, and 
   mark the closest point in the curve to that position as a dominant 
   point. For segment $j$, the centroid is computed as 

   $$C_j = \frac{\sum_{i=ini_j}^{end_j} i\, c_i}{\sum_{i=ini_j}^{end_j} c_i},$$

   where $ini_j$ and $end_j$ are the first and last positions of the segment. 
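
The post-processing of steps 3 and 4 can be sketched as below. The sketch assumes that the centroid of a segment is the mean position weighted by the thresholded output, treats the signal as open (a segment crossing the end of a closed contour is not merged), and expects a positive threshold; the name `detect_dominant_points` is hypothetical.

```python
def detect_dominant_points(b, k0):
    """Threshold the net output and return one index per surviving segment."""
    c = [x if x >= k0 else 0.0 for x in b]
    dominant, start = [], None
    # A trailing zero acts as a sentinel that closes a final segment.
    for i, value in enumerate(c + [0.0]):
        if value > 0 and start is None:
            start = i                      # a new segment begins
        elif value == 0 and start is not None:
            segment = range(start, i)      # positions ini_j .. end_j
            weight = sum(c[j] for j in segment)
            centroid = sum(j * c[j] for j in segment) / weight
            dominant.append(round(centroid))
            start = None
    return dominant
```

Raising the threshold shortens or removes segments, so fewer dominant points are reported; this is why the threshold is tuned together with the net topology.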

The overall system performance is displayed in Fig. 1 in a schematic way. The 
three main parts: pre-processing, neural analysis, and post-processing can be 
identified. The items to be set are the net architecture and the threshold value. We aim 
to find a configuration able to learn any DPD algorithm based on the computation of 
local significance measures. 




[Figure 1: CURVE → $F = \{f_i\}$: 4 2 3 0 0 6 8 ... → (first differences) → $A = \{a_i\}$: 0.25 0.675 0.125 0 ... → ... → (centroid calculation) → dp positions → POLYGONAL APPROXIMATION] 

Fig. 1. Steps involved in the neural dominant point detection. 



3. Results 

3.1 Analysis of Open Curves 

To test the efficiency of the proposed solution for detecting dominant points in 
open curves, closed contours of real objects were taken and 5 contiguous points were 
deleted, opening the curve. For each one, in closed and open version, every DPD 
algorithm considered was run. Then, the optimisation error was computed for each 
case, considering only the same portion in the closed contours that was used in the 
open ones, and including the extremes as dominant points in the latter. Results are 
shown in Table 2. 



Table 2. Mean optimisation errors for the open and closed 
contours, and the percentage of coincidence. 

Algorithm    $E_0$ (open)    $E_0$ (closed)    % coincidence 
R-J          0.1167          0.1337            91.5 
R-W          0.1133          0.1259            93.5 
F-D          0.1032          0.0822            95.5 
T-Ch         0.0532          0.0603            97.1 
NCDP         0.0532          0.0603            97.2 

For all the algorithms tested, more than 90% of the dominant points found in the 
open curves matched the ones found in the corresponding closed ones, and the 
optimisation error was similar in both cases, which shows the validity of the proposed 
solution. 



3.2 Neural DPD 

A set of 30 real images of natural objects, containing 490 contours and 39354 
points, was used. The set was divided into two groups, one for training and another for 
validation, in such a way that the first one contains roughly twice as many 
curves as the second one. The training set had 315 contours and 26435 
points, and the validation set had 175 contours and 12919 points. 

In Fig. 2 the behavior of the net after learning one of the algorithms is shown. It 
can be seen how the net produces an output close to that of the algorithm. Different 
behaviors of other algorithms also cause different behaviors in the net output. 





Fig. 2. Left: performance of the Teh-Chin DPD algorithm on a closed contour; Right: outcome 
of the net on the same contour after learning the behaviour of that algorithm. 



The final aim is to study which net architecture behaves best for the task of 
learning these algorithms. For each DPD algorithm, a fixed network architecture will 
adopt a different set of weights. The generalisation experiment is not about the 
behaviour of a particular net, but about the parameters that are heuristically set in its 
architecture (number of neurons in the hidden layer, etc.). 

Most of the DPD algorithms need some parameters to be suitably adjusted, so 
a study to select the best input parameters for each algorithm was made prior to 
the network training. The criterion to adjust these parameters was to obtain the 
minimum mean optimisation error for the training set. The following set of 
parameters was selected: 

- Rosenfeld-Johnston: m = n/10 

- Rosenfeld-Weszka: m = n/10 

- Freeman-Davis: m = 1, s = 5 

- Teh-Chin: using k-curvature (non-parametric) 

- NCDP: using k-curvature, with a parameter value of 0.15 

The number of neurons in the input layer is usually set according to the nature of 
the problem. Here this number matches the size of the support region that depends on 
each algorithm. Thus, the number of input neurons will be also a parameter to be 
introduced in the study. To analyse the performance with different sizes of the support 
region, a study of all of the considered DPD algorithms was done varying their 
parameters and region sizes, and the suitable values obtained were between 5 and 13 
input neurons. These results are shown in table 3. 

Table 3. Widths of the support region for each considered algorithm. 

Algorithm    Mean support region width    Optimisation error 
R-J          10.2                         0.2806 
R-W          10.2                         0.3034 
F-D           5.0                         0.1008 
T-Ch          5.9                         0.0611 
NCDP          5.9                         0.0625 



The research on the number of neurons for the hidden layer was based on previous 
experiences [San96]. The number of hidden neurons was varied from 3 to 11. 
According to all of these considerations, 15 topologies were used, as shown in Table 
4 (I: no. of neurons in the input layer; H: in the hidden layer; O: in the output layer): 



Table 4. Configurations considered for the number of neurons for each layer. 

I_H_O:   13_11_1   13_9_1   13_7_1   13_5_1   13_3_1 
         11_9_1    11_7_1   11_5_1   11_3_1 
         9_7_1     9_5_1    9_3_1 
         7_5_1     7_3_1 
         5_3_1 
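
The 15 configurations of Table 4 can be regenerated with a short sketch. It assumes, from inspection of the table rather than from an explicit statement in the text, that the input and hidden sizes are odd and that the hidden layer is always smaller than the input layer; `candidate_topologies` is a hypothetical name.

```python
def candidate_topologies(inputs=(13, 11, 9, 7, 5), hidden=(11, 9, 7, 5, 3)):
    """List (I, H, O) topologies with fewer hidden than input neurons."""
    # Single output neuron: the estimated curvature at the central point.
    return [(i, h, 1) for i in inputs for h in hidden if h < i]
```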



With the 5 considered algorithms and the 15 configurations, 75 training sets were 
set up. Using the classic backpropagation algorithm, with a learning factor of 
0.35 and a momentum of 0, the nets converged in 75 epochs on average, with a 
mean square error between 0.01 and 0.06. To evaluate the generalisation capability of 
the nets, each one, trained with a different algorithm, was applied to the validation set, 
and the results were compared to the direct application of the corresponding algorithm 
to those curves. 



3.3 Learning 

One parameter that affects the error ($E_0$) obtained by the net is the curvature 
threshold, $K_0$, chosen in step 3 of the neural DPD algorithm. Nine values of $K_0$ were 
tested (0.3, 0.4, 0.45, 0.5, 0.6, 0.65, 0.7, 0.75, 0.8), and different analyses were carried out. 

In Fig. 3 the graphs of $E_0$ for the validation set are displayed against $K_0$. Each 
graph corresponds to a different algorithm, and shows the behaviour of the 5 topologies 
with minimum $E_0$. As can be observed, all the algorithms except Freeman-Davis 
reach the minimum $E_0$ between 0.6 and 0.75. The best topology for each 
algorithm was different, as shown in Table 5, so no architecture can be 
singled out a priori for the second part of the study. 



[Figure 3: five panels, one per algorithm: Rosenfeld-Johnston, Rosenfeld-Weszka, Freeman-Davis, Teh-Chin, Non-Collinear D.P.] 

Fig. 3. Behaviour of the 5 studied algorithms for different configurations of the net when the 
output threshold is varied. 

Table 5. Best resulting topologies from the study 
for each algorithm. 

Algorithm    Topology    Threshold $K_0$ 
R-J          5-3-1       0.60 
R-W          9-7-1       0.75 
F-D          7-5-1       0.45 
T-Ch         11-3-1      0.70 
NCDP         13-3-1      0.70 



3.4 Looking for a Generalised Architecture 

In order to select the best topology, taking into account the mean optimisation error 
for the algorithms, the multiple-mean comparison test was applied. This statistical test 
is suitable for populations that follow a normal distribution. The Central Limit 
Theorem shows that any population with a high number of individuals (more than 30) 
approaches that distribution [Her75], so we may assume that our 15 populations 
(topologies) of 45 individuals each (5 algorithms × 9 thresholds) follow a normal 
distribution. 

We want to show that if a given configuration is able to learn $n$ algorithms, it will 
be able to learn one more ($n+1$), and we aim to look for the topology that performs 
best in this task. For this, the following cross-validation test was applied. Each 
time, 4 out of the 5 algorithms analysed were taken, and the 
following steps were carried out: 

1. Each of the 15 topologies in Table 4 was trained with the 4 selected algorithms 
   separately. 

2. Each of those 15×4 neural networks was applied to the validation set, varying 
   the values of the threshold to determine the topology and the value of $K_0$ that 
   provide the best performance for the 4 algorithms. 

3. For each of those topology-threshold settings, the global optimisation error was 
   computed for the validation set and for each of the 4 algorithms. Finally, the 
   mean optimisation error was computed for each setting. 

4. The multiple-mean comparison test was applied to the values of the mean 
   optimisation errors. In each comparison, the Bartlett test was applied to 
   determine whether the variances were equal or not, and they were found to be 
   equal in all cases. 

5. Finally, the net whose mean optimisation error was the best was trained with the 
   algorithm left out, and applied to the validation set using the best threshold. 
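
The selection loop of this cross-validation can be sketched as follows. The sketch is a simplification: it picks the topology-threshold setting with the lowest mean error over the 4 selected algorithms and omits the multiple-mean comparison and Bartlett tests; the function name and the layout of the `errors` dictionary are hypothetical.

```python
from itertools import combinations

def leave_one_algorithm_out(errors):
    """errors[algorithm][setting] -> optimisation error on the validation set.

    For every choice of all-but-one algorithms, pick the setting with the
    lowest error averaged over the selected algorithms, and report it
    together with the error of the held-out algorithm for that setting.
    """
    algorithms = sorted(errors)
    results = {}
    for selected in combinations(algorithms, len(algorithms) - 1):
        held_out = next(a for a in algorithms if a not in selected)
        settings = errors[selected[0]].keys()
        best = min(
            settings,
            key=lambda s: sum(errors[a][s] for a in selected) / len(selected),
        )
        results[held_out] = (best, errors[held_out][best])
    return results
```

In the study each setting would be a pair such as ("9-7-1", 0.75) and each error the optimisation error measured on the validation set.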



The results obtained in the cross-validation test are shown in Table 6. 

Table 6. Results for cross-validation. 

Algorithms used to choose    Best topology;    Algorithm for    $E_0$ obtained using the    $E_0$ for NN after learning 
the best architecture        threshold         validation       validation algorithm      the validation algorithm 
R-J, R-W, F-D, T-Ch          9-7-1; 0.75       NCDP             0.06                      0.07 
R-W, F-D, T-Ch, NCDP         9-7-1; 0.75       R-J              0.31                      0.27 
F-D, T-Ch, NCDP, R-J         11-9-1; 0.70      R-W              0.30                      0.36 
T-Ch, NCDP, R-J, R-W         13-3-1; 0.75      F-D              0.17                      0.25 
NCDP, R-J, R-W, F-D          9-7-1; 0.75       T-Ch             0.06                      0.08 



In Table 6, the first column shows the algorithms that were selected for learning 
and the second one shows the topology and threshold that gave the best performance 
after learning those algorithms. Then, the algorithm left out to test the learning 
capability of the selected net is indicated. As shown in the two rightmost columns, the 
values of the optimisation error obtained by the algorithm and by the net on the 
same set of curves indicate that in all cases the net was able to learn the new 
algorithm, because the mean optimisation error obtained was similar for both 
methods. 

Finally, the multiple-mean comparison test was applied to find the 
topology-threshold setting able to learn best the 5 studied algorithms, resulting in 
topology 9-7-1 and threshold $K_0$ = 0.75. The error curves obtained for the algorithms 
with this net configuration are shown in Fig. 4. 




Fig. 4. Error curves for the best net 
configuration for every studied algorithm. 



4. Conclusions 

In this work, a feasibility and generalisation study on the ability of multilayer 
artificial neural networks to learn the performance of dominant point algorithms has 
been presented. These algorithms are usually utilised as a way to build polygonal 
approximations of digital curves through the selection of local curvature maxima. A 
way to apply those algorithms to open curves has been also presented. 

From the analysis of the results of the cross-validation experiment, it can be stated 
that a multilayer perceptron with the topology 9-7-1, followed by a threshold $K_0$ = 
0.75, is able to learn a wide variety of DPD algorithms, at least if they are based on 
high-curvature points. Thus, it can be concluded that ANNs are able to perform 
this task successfully. 

In the future other DPD algorithms and ANN paradigms will be incorporated into the 
study, seeking more powerful architectures or faster learning algorithms. 
A reliable architecture able to learn different algorithms will allow us to 
build polygonal approximations of digital curves in a time independent of the 
complexity of the algorithm simulated by the network. This fact also opens the door 
to the parallelisation of any DPD algorithm, regardless of its suitability for being 
parallelised, or to the hardware implementation of any detection procedure for 
real-time applications. 



Acknowledgements 

This work has been partially funded by Fundacion Bancaja under project 

PI A96-18 and Generalitat Valenciana under project GV97-TI-05-26. 

References 

[Can96] Cancelle, R. & R. Gemello. "Efficient training of Time Delay Neural 
Networks for sequential patterns". Neurocomputing, vol. 10(1), pp. 33- 
42, 1996. 

[FD77] Freeman, H. & L.S. Davis. "A corner-finding algorithm for chain- 
coded curves". IEEE Trans. Comput., vol. C-26, pp. 297-303, 1977. 

[Her75] Hernandez, L.H. & A. Del Castillo. "Probabilidades", CUJAE, Ciudad 
de la Habana, 1975. 

[Her91] Hertz, J., A. Krogh, & R.G. Palmer. "Introduction to the Theory of 
Neural Computation". Addison-Wesley, 1991. 

[Ine98] Inesta, J.M., Buendia, M. & Sart, M.A. "Reliable Polygonal 
Approximations of Imaged Real Objects Through Dominant Point 
Detection". Pattern Recognition, vol. 31(6), pp. 685-699, 1998. 

[Loo96] Looney, G. "Stabilization and Speedup of Convergence in Training 
Feedforward Neural Networks". Neurocomputing, vol. 10(1), pp. 7-31, 
1996. 

[MO93] Melen T & T. Ozanian. "A fast algorithm for dominant point detection 
on chain-coded contours". In Proc. of the Int. Conf. on the Computer 
Analysis of Image and Patterns. Budapest, Sept. 1993. Lecture Notes 
on Computer Science, Springer-Verlag. 

[RJ73] Rosenfeld A & E. Johnston. "Angle detection on digital curves", IEEE 
Trans. Comput., vol. C-22: 875-878, 1973. 

[RW75] Rosenfeld A & J.S. Weszka. "An improved method of angle detection 
on digital curves". IEEE Trans. Comput., vol. C-24: 940-941, 1975. 

[San96] Sanchiz, J.M.; J.M. Inesta & F. Pla. "A Neural Network Algorithm to 
Detect Dominant Points from the Chain-code of a Closed Contour". In 
Proc. of 13th International Conference on Pattern Recognition; ICPR 
'96, Vienna, Austria. IEEE Computer Society Press, vol. IV, pp. 325- 
329. 1996. 

396 Aurora Pons et al. 

[Sta92] Starita, A. & A. Sperduti. "Speedup Learning and Network 
Optimization with Extended Back Propagation". Technical Report. 
October 1992. Universita Degli Studi Di Pisa. Dipartimento di 
Informatica. 

[TCh89] Teh, C. & R.T. Chin. "On the Detection of Dominant Points on Digital 
Curves". IEEE Transactions on Pattern Analysis and Machine 
Intelligence, vol. 11(8), pp. 859-872, 1989. 

[Thi95] Thimm, G. & E. Fiesler. "Neural Network Initialization". Proceedings 
of International Workshop on Artificial Neural Networks. Malaga- 
Torremolinos, Spain, June 1995. pp. 535-542. 

[Tre95] Trejo, L.A. & C. Sandoval. "Improving Back-Propagation: Epsilon- 
Back-Propagation". Proceedings of International Workshop on 
Artificial Neural Networks. Malaga-Torremolinos, Spain, June 1995. 
pp. 427-432. 




Defeasible Constraint Solving over the Booleans 

Pedro Barahona 

Departamento de Informatica, Universidade Nova de Lisboa 
2825 Monte da Caparica, Portugal 
email: pb@di.fct.unl.pt 

Abstract. This paper extends a constraint solver over the booleans to make 
it defeasible, and embeddable in a general architecture for defeasible 
constraint solving. This complements previous work on defeasible solvers over 
finite domains and rational numbers. Similarly to the latter, one approach uses 
witness variables to detect minimal conflict sets of constraints, but adds 
significant overhead. Other approaches use data dependencies, as in finite 
domains, to detect conflict sets. Although these sets are not minimal, such 
approaches seem more promising due to their lower complexity. 



1. Introduction 

Defeasible constraint solvers are suitable for over-constrained problems [JaFM96], 
namely model-based diagnostic applications, where defeasible constraints model 
faulty components, and planning/scheduling applications in which defeasible 
constraints represent preferences that may be left unsatisfied if necessary. 

Defeating constraints is an important addition to a constraint solver in that a) 
upon being told a constraint which makes the constraint store inconsistent, it returns 
information on which constraints are effectively responsible for the inconsistency; and 
b) it enables the incremental defeat of a constraint, removing it from the constraint 
store and avoiding, as much as possible, resetting other constraints. 

A general architecture to implement a Hierarchical Constraint Logic 
Programming instance ([BMMW89]) was presented in [HoMB96], with two main 
components, the Defeasible Constraint Solver and a Hierarchy Manager that tells and 
defeats constraints to the solver. The architecture is exemplified with linear 
constraints over the rationals, and extends work previously done with defeasible 
constraints on finite domains [MeBa93, MeBa95, MeBa96]. In this paper we 
complement such previous work with a study on defeasible boolean constraints. 

Although boolean constraints are often more efficiently handled through finite 
domain techniques [CoDi94] in single-solution applications, boolean solvers are still 
important (and offered by some systems, e.g. SICStus [SICS96]), namely when all 
solutions are sought, as in decision support systems, where a user may be interested in 
comparing the available solutions. Because they use incomplete reduction techniques, finite 
domain solvers rely on variable enumeration, and defeating a constraint may eventually 
restart all such enumeration. 

This paper studies the extension of a classical boolean solver to a defeasible 
setting and is organised as follows. In section 2 the original boolean solver is briefly 
discussed. Section 3 extends the solver by adding witness variables. Section 
4 presents an alternative approach, based on data dependencies, which is improved in 
section 5. Finally, section 6 presents the main conclusions. 

2. Boolean Constraint Solving 

Among other boolean constraint solvers [Benh93, Rauz93, MSSA93], the boolean 
unification algorithm developed in [BuSi87] is based on the theory of boolean rings, 
with operators + (exclusive disjunction) and • (conjunction). Function solve, below, 
produces, when possible, a most general unifier (m.g.u.) S of expressions B1 and B2, 
by solving B1+B2 (in this paper solving a constraint means unifying it with 0). 

function solve(in C: Constraint; out S: Substitution): boolean; 
   case C = 1: solve := false 
   case C = 0: solve := true; S := {} 
   case C = A*v+B:                        % v is an arbitrary variable 
      if solve((1+A)*B, Sv) then 
         V := apply(Sv, (1+A)*v'+B);      % v' is a fresh variable 
         solve := true; S := Sv ∪ {v/V} 
      else solve := false 
      end if 
   end case 
end function 



Solve succeeds if C is solvable, in which case it returns the m.g.u. S. The first two 
cases are obvious. In the third case, C is rewritten as A*v+B (where v is an arbitrary 
variable, appearing in neither A nor B) and the rationale of the function is the following: 
since C cannot be solved iff A=0 and B=1, it must be imposed that A=1 or B=0, i.e. 
(1+A)*B=0. This is done simply by solving an expression with one less variable, 
which justifies the recursive function call. Substitution v/(1+A)v'+B is the m.g.u. 
since if A=1 then it guarantees v = B (and A*v+B=0); if A=0 (and B=0), then v plays 
no role in solving the constraint, and is replaced by a new variable v'. 

An m.g.u. is composed with previous substitutions in the usual way, and such 
composition implements a “constraint store”. During program execution, boolean 
constraints are told to a constraint store (that maintains the composition of all the 
substitutions) and if the store is incompatible with a new constraint then its telling 
fails. Telling constraint C to a constraint store CSi can be defined as follows: 

function tell(in C: Constraint; CSi: Subst; out CSo: Subst): boolean; 
   C1 := apply(CSi, C); tell := solve(C1, S); compose(CSi, S, CSo) 
end function 



Example 1. Tell constraints {a+b, bed, de+1} to an empty constraint store. 

Telling the first constraint results in constraint store CSl={a/b}. Telling the 
second constraint to CSI results in substitution {b/(l+c d)b'} which composed with 
CSI becomes CS2={a/(l+c d)b’, b/(l+c d)b'}. The third constraint yields {d/1, e/1} 
which composed with CS2 results in CS3 = (a/(l+c)b’, b/(l+c)b’, d/1, e/1} 

Example 2. The set of constraints {a+b, bcd, de+1, 1+ac} is unsatisfiable.

Applying the previous constraint store CS3 to constraint 1+ac results in
1+(1+c)b'c, which is simplified to 1, and fails trivially to be solved.

In CLP no effort is made to identify the constraints that caused the failure. Once a 
failure is detected, the usual backtracking mechanism is triggered to exploit 
alternatives [DHSA88, SICS96]. A defeasible system, though, should identify the
responsible constraints and relax one of them with the least computational effort.

Definition 1. A conflict set is a set of constraints that are not satisfiable.

Definition 2. A conflict set is minimal, if all its strict subsets are satisfiable. 
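
Definitions 1 and 2 can be checked by exhaustive enumeration on small examples. The sketch below is not the solver of this paper: it is a brute-force check, and the function names and the encoding of constraints as Python functions over an assignment are assumptions made for illustration.

```python
from itertools import combinations, product


def satisfiable(constraints, variables):
    # A constraint maps an assignment to 0 or 1; it is solved when it is 0.
    return any(
        all(c(dict(zip(variables, values))) == 0 for c in constraints)
        for values in product((0, 1), repeat=len(variables))
    )


def minimal_conflict_sets(constraints, variables):
    """constraints: dict label -> function(assignment) -> 0 or 1."""
    found = []
    labels = sorted(constraints)
    for size in range(1, len(labels) + 1):        # smaller subsets first
        for subset in combinations(labels, size):
            if any(set(f) <= set(subset) for f in found):
                continue   # contains a smaller conflict set, so not minimal
            if not satisfiable([constraints[l] for l in subset], variables):
                found.append(subset)
    return found


# The constraints of Example 2 (+ is XOR, juxtaposition is AND).
EXAMPLE_2 = {
    1: lambda e: e["a"] ^ e["b"],                 # a+b
    2: lambda e: e["b"] & e["c"] & e["d"],        # bcd
    3: lambda e: (e["d"] & e["e"]) ^ 1,           # de+1
    4: lambda e: 1 ^ (e["a"] & e["c"]),           # 1+ac
}
```

For EXAMPLE_2 the only minimal conflict set found is the whole set of four constraints, since every strict subset is satisfiable.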

In case of a failure, a Defeasible Constraint Solver ideally returns the set of all
minimal conflict sets, so as to prevent the Hierarchy Manager from telling sets of
constraints that include minimal conflict sets. However, this is not always practical.

For linear constraints over the rationals, the constraint solver keeps a solved 
form of the constraint store and it is easy to interpret the solved forms to identify 
minimal conflict sets [HoMB96, KoHo96]. In constraint solvers that use domain 
reduction techniques (e.g. for finite domains), conflict sets are identified by data 
dependency techniques [MeBa93, CoDR96], which do not ensure their minimality. 

3. DCS through Witness Variables 

This section introduces a defeasible constraint solver that relies on witness variables, 
thus resembling defeasible solvers for linear constraints over the rationals. 

Definition 3. For any constraint Ci, there is an extended constraint Ei = wi*Ci, where
wi, the witness variable of the constraint, does not appear in any other constraint.

For example, constraint ab+bc+1 is extended to wi(ab+bc+1). The purpose of such
witness variables is threefold. Firstly, if a set of constraints is satisfiable then the
set of extended constraints is satisfiable and no conditions are imposed on the witness
variables. Secondly, when solving the extended set of constraints all minimal conflict
sets are identified. Thirdly, a constraint is defeated by simply setting its witness
variable to 0. These points are justified by the following theorem.

Theorem 1: A set of constraints {C1, ..., Cn} is satisfiable iff the corresponding set
of extended constraints {w1*C1, ..., wn*Cn} is solved by a substitution S that does not
impose any constraint on the witness variables wi.

Proof. => If the set of constraints is satisfiable then there is a substitution S, which
does not include any of the witness variables and solves all the constraints Ci (i.e.
equates them to 0). Since the extended constraints are the conjunction of a witness
variable wi and a constraint Ci, the same substitution S, with no constraints on the
witness variables, also solves the set of extended constraints.

<= If the set of constraints is not satisfiable then the extended constraints
may only be satisfied if (at least) one of the witness variables is set to 0. Hence, any
substitution that solves the set of extended constraints includes one pair wi/Ei.

3.1 Checking Consistency 

This theorem suggests a change in the boolean unification algorithm that makes it
defeasible. Instead of arbitrarily selecting a variable to be solved upon, the selection
should only consider variables that are not witnesses. If this is impossible (only
witness variables appear in the constraint to solve), then selecting a witness variable
would force its substitution, which means, by Theorem 1, that the original set of
constraints cannot be solved. This is done by the following adaptation of the previous
function solve:

function solve_1(in C: const; out S: Subst, D: const): boolean;
   case C=0: solve_1 := true; S := {}
   case C = A*v+B:                        % where v ∉ {w1, w2, ..., wn}
      if solve_1((1+A)*B, Sv, D) then
         solve_1 := true,
         V := apply(Sv, (1+A)*v'+B),
         S := Sv U {v/V}
      else solve_1 := false
      endif
   else                                   % C only contains witness variables
      solve_1 := false; D := C
   end case
end function



Telling an extended constraint is now done with the following function 



function tell_1(in C: const; CSi: Subst; out CSo: Subst, D: const): bool;
   C1 := apply(CSi, C),
   if solve_1(C1, S, D) then
      CSo := compose(CSi, S), tell_1 := true
   else tell_1 := false
   endif
end function



Example 3. Tell constraints {ab+bc+1, ab+ac+1, a+b+c+1} to an empty CS.

In a non-defeasible setting, telling the first constraint to an empty substitution
returns constraint store CS1 = {a/c+1, b/1}. Telling the second constraint to CS1
results in constraint store CS2 = {a/1, b/1, c/0}. When telling the third constraint, CS2
is applied to the constraint, which becomes 1 and cannot be solved. In this defeasible
setting, telling the first extended constraint w1(ab+bc+1) to the empty constraint store
returns ECS1 = {a/~w1a'+w1~c, b/~w1b'+w1} (for convenience, for any variable v, ~v
denotes 1+v, i.e. logical ¬v). Telling the second extended constraint to ECS1 returns
ECS2 = {a/~w1~w2a'+w1~w2~c'+w2, b/~w1b'+w1, c/~w2c'+~w1w2~b'}. Telling the third
constraint fails, returning expression D = w3(w1+w2+w1w2).
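
The expression D of Example 3 can be cross-checked without running the unification algorithm. The following sketch, whose function names are assumptions made here, enumerates all assignments and confirms that D evaluates to 0 exactly for those values of the witnesses that leave a solvable set of extended constraints.

```python
from itertools import product

# Constraints of Example 3 over (a, b, c); + is XOR, juxtaposition is AND.
CONSTRAINTS = [
    lambda a, b, c: (a & b) ^ (b & c) ^ 1,    # ab+bc+1
    lambda a, b, c: (a & b) ^ (a & c) ^ 1,    # ab+ac+1
    lambda a, b, c: a ^ b ^ c ^ 1,            # a+b+c+1
]


def extended_solvable(w):
    # The extended constraints wi*Ci are solved when every product is 0.
    return any(
        all((wi & ci(a, b, c)) == 0 for wi, ci in zip(w, CONSTRAINTS))
        for a, b, c in product((0, 1), repeat=3)
    )


def d_expression(w1, w2, w3):
    return w3 & (w1 ^ w2 ^ (w1 & w2))         # D = w3(w1+w2+w1w2)


for w in product((0, 1), repeat=3):
    # D is 0 exactly when the chosen witnesses leave a solvable set.
    assert (d_expression(*w) == 0) == extended_solvable(w)
```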

3.2 Detecting Causes of Failure 

This example shows that the expression on witness variables returned by function
tell_1 identifies all minimal conflict sets in the original constraints. More formally:

Theorem 2: If a set of constraints C = {C1, ..., Cn-1} is satisfiable but C' = C U {Cn} is
unsatisfiable, then a) tell_1 succeeds for all the extended constraints wi*Ci where i<n;
and b) tell_1 fails when the extended constraint wn*Cn is subsequently told, returning
an expression D that encodes all the conflict sets in the set C'.

Proof. Since a) is a direct consequence of Theorem 1, only b) needs to be addressed.
Theorem 1 guarantees that an expression D on the witness variables alone is obtained
by solve_1 when constraint wn*Cn is told. Hence, satisfying the set of extended
constraints requires that D is solved (i.e. equated to 0). Since the original set of
constraints is unsatisfiable, satisfying the set of extended constraints requires that at
least one of the witnesses is equated to 0. Hence, solving D identifies all the
combinations of witness variables that must be set to 0, i.e. all the conflict sets.

Example 3b. Expression D = w3(w1+w2+w1w2) evaluates to 0 either when w3 is set to
0 or when w1 and w2 are both set to 0. Therefore the minimal conflict sets in this
example are {w1,w3} and {w2,w3}. Since w3 belongs to all (minimal) conflict sets,
setting it to 0 solves all conflicts. Otherwise, both w1 and w2 must be set to zero.

For convenience, expression D should be converted into a form that allows easy
identification of the minimal conflict sets. This form is identified in the following

Corollary. In the conditions of Theorem 2, expression D returned upon telling
constraint Cn is equivalent to wn(1+(1+wa1...waα)(1+wb1...wbβ)...(1+wc1...wcγ)), where
{Ca1...Caα,Cn}, {Cb1...Cbβ,Cn} ... {Cc1...Ccγ,Cn} are the minimal conflict sets of C'.

Proof. This follows from the properties of boolean rings. Since Cn is a member of all
conflict sets, setting wn to 0 solves all conflicts. Moreover, no other constraint belongs
to all conflict sets, otherwise the set C = {C1 ... Cn-1} would not be satisfiable. Thirdly,
if wn is 1, then expression D is evaluated to 0 iff one variable in each of the products
wa1...waα, wb1...wbβ, ..., wc1...wcγ is set to 0. Finally, all the conflict sets in the
expression are minimal, since for each non minimal conflict set {Ca1...Caα,Cn,Cx}
represented by expression (1+wa1wa2...waαwx) there is a minimal conflict set
{Ca1...Caα,Cn} of the form (1+wa1wa2...waα), and (1+xa)(1+a) = (1+a).

Example 3c. Expression D = w3(w1+w2+w1w2), from Example 3b, is equivalent to
w3(1+(1+w1)(1+w2)), which identifies all minimal conflict sets {w1,w3} and {w2,w3}.
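
The form given in the Corollary can be built mechanically from the minimal conflict sets. The sketch below evaluates that form for given witness values; the function name and the dictionary encoding of the witnesses are assumptions for illustration only. It also checks the absorption identity (1+xa)(1+a) = (1+a) used in the proof.

```python
from itertools import product


def corollary_form(w, wn, minimal_sets):
    """Evaluate wn*(1 + product over the sets of (1 + product of their w[i])).

    w maps constraint indices to witness values; minimal_sets lists the
    minimal conflict sets with the last constraint Cn left out.
    """
    outer = 1
    for s in minimal_sets:
        inner = 1
        for i in s:
            inner &= w[i]          # product of the witnesses of one set
        outer &= 1 ^ inner         # factor (1 + product)
    return wn & (1 ^ outer)


# Example 3c: the minimal conflict sets are {1,3} and {2,3}, with Cn = C3.
for w1, w2, w3 in product((0, 1), repeat=3):
    d = w3 & (w1 ^ w2 ^ (w1 & w2))
    assert corollary_form({1: w1, 2: w2}, w3, [{1}, {2}]) == d

# Absorption in a boolean ring: (1+xa)(1+a) = (1+a).
for x, a in product((0, 1), repeat=2):
    assert (1 ^ (x & a)) & (1 ^ a) == 1 ^ a
```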



3.3 Defeating a Constraint 

In this approach, defeating a constraint is trivial. All that is required is to set its 
witness variable to zero and make the appropriate simplifications. 

Example 3d. Defeat constraint w1(ab+bc+1) from ECS2 as defined in Example 3.

Since ECS2 = {a/~w1~w2a'+w1~w2~c'+w2, b/~w1b'+w1, c/~w2c'+~w1w2~b'} (where ~v
denotes 1+v), composing it with {w1/0} returns the extended substitution
ECSx = {a/~w2a'+w2, b/b', c/~w2c'+w2~b'}, which, after elimination of the renaming
pair b/b', is the substitution obtained by telling the second constraint w2(ab+ac+1)
to an empty constraint store.



3.4 Discussion 

Although the introduction of witness variables allows the detection of all minimal 
conflict sets, it does so by increasing the number of variables. This is of course a 
problem as checking satisfaction of boolean constraints is an NP-complete problem, 
whose complexity grows exponentially with the number of variables. 

Efficient encodings (such as those in [Brya86], used to speed up the boolean constraint
solver of the SICStus system) might improve the efficiency of a Defeasible Constraint
Solver, but only a moderate number of witness variables can be used in practice. This
limits the type of applications that can be addressed to those that naturally have a 
small number of defeasible constraints, or those in which the constraints can be 
organised in a small number of sets where either all or none are considered. 

4. DCS through Data Dependencies

An alternative approach to adding witness variables is the use of data dependencies 
to identify conflict sets. This is the technique adopted in defeasible constraint solvers 
based on reduction techniques (e.g. finite domains), namely [MeBa93, CoDR96]. 
This section introduces a technique appropriate to the boolean domain. 

4.1 Checking Consistency 

Definition 4. For any constraint C, there is a labelled constraint L-C where L, the 
label of the constraint, is a unique identifier. 

For convenience, positive integers are used as labels in this paper. In order to keep 
track of the data dependencies, labelled pairs are defined as follows: 

Definition 5. For every pair v/V in a substitution, there is a labelled pair L-Z-v/V 
where L is the label of the constraint that originated the pair and Z is the set of labels 
of all the constraints (excluding L) that were used to obtain V. 

Definition 6. A labelled substitution is a set of labelled pairs. 

Function tell_2, below, tells a labelled constraint into a labelled substitution 

function tell_2(in L-C: labelled_const; LCSi: labelled_subst;
                out LCSo: labelled_subst, X: setof labels): boolean;
   L1-Z1-C1 := apply_2(LCSi, L-C),
   if solve_2(C1, LCSi, S) then
      LS := add_label(L1-Z1, S),
      tell_2 := true, LCSo := compose_2(LCSi, LS),
   else tell_2 := false, X := {L1} U Z1
   endif
end function



Function apply_2 collects the labels of all the pairs used in the transformation of C
into C1 (if v occurs in constraint C and LCSi contains the labelled pair Lv-Zv-v/V,
then the set {Lv} U Zv is a subset of {L1} U Z1). Function solve_2 is similar to function
solve of section 2 since the labels are ignored. Function add_label simply adds the
label to all the pairs in the returned substitution S. Finally, compose_2 composes the
two labelled substitutions avoiding repetitions on the labels. This can be illustrated
with the constraints of Example 3.

Example 4. Tell labelled constraints {1-ab+bc+1, 2-ab+ac+1, 3-a+b+c+1} to an
empty labelled constraint store.

Telling the first labelled constraint returns LCS1 = {1-{}-a/c+1, 1-{}-b/1}.
Telling constraint 2-ab+ac+1 to LCS1 first applies LCS1 to the constraint, which
results in 2-{1}-c, since variables a and b, whose pairs originated in constraint 1, are
replaced in this constraint. Then, solving constraint c simply returns substitution c/0,
to which the label 2-{1} is added, resulting in {2-{1}-c/0}. Composing it with LCS1
results in the new constraint store LCS2 = {1-{2}-a/1, 1-{}-b/1, 2-{1}-c/0}. When
constraint 3-a+b+c+1 is told to the constraint store, LCS2 is applied to the constraint,
which becomes 3-{1,2}-1. This constraint cannot be solved and the set {1,2,3} =
{3} U {1,2} is returned by function tell_2.
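
The bookkeeping of tell_2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: polynomials are sets of monomials, the store maps each variable to a triple (L, Z, V), all pairs are applied at once, and every helper name (poly, variables, substitute) is hypothetical.

```python
from itertools import count

ZERO, ONE = frozenset(), frozenset([frozenset()])
_fresh = count(1)


def poly(*monomials):
    # poly("ab", "bc", "") stands for ab+bc+1 (the empty monomial is 1).
    return frozenset(frozenset(m) for m in monomials)


def mul(p, q):
    out = ZERO
    for m in p:
        for n in q:
            out = out ^ frozenset([m | n])     # XOR cancels equal monomials
    return out


def substitute(p, subst):
    out = ZERO
    for m in p:
        term = ONE
        for v in m:
            term = mul(term, subst.get(v, poly(v)))
        out = out ^ term
    return out


def variables(p):
    return {v for m in p for v in m}


def solve(c):
    # Unlabelled boolean unification, as in function solve of section 2.
    if c == ZERO:
        return {}
    if c == ONE:
        return None
    v = min(variables(c))
    a = frozenset(m - {v} for m in c if v in m)
    b = frozenset(m for m in c if v not in m)
    sv = solve(mul(ONE ^ a, b))
    if sv is None:
        return None
    value = mul(ONE ^ a, poly(f"{v}_{next(_fresh)}")) ^ b
    return {**sv, v: substitute(value, sv)}


def tell_2(label, c, store):
    """store: variable -> (L, Z, V). Returns (ok, new_store, conflict_labels)."""
    used = {x for x in variables(c) if x in store}
    deps = set()
    for x in used:                              # apply_2: collect the labels
        l, z, _ = store[x]
        deps |= {l} | z
    c1 = substitute(c, {x: store[x][2] for x in used})
    s = solve(c1)
    if s is None:
        return False, store, {label} | deps
    composed = {}
    for x, (l, z, val) in store.items():        # compose_2
        if variables(val) & set(s):
            z = (z | {label} | deps) - {l}      # no repetition of own label
            val = substitute(val, s)
        composed[x] = (l, z, val)
    for x, val in s.items():                    # add_label
        composed[x] = (label, deps - {label}, val)
    return True, composed, set()
```

Telling the three constraints of Example 4 in sequence with this sketch reproduces LCS1 and LCS2 and ends with the set of labels {1,2,3}.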

4.2 Detecting Causes of Failure

Example 4 illustrates the role of the set of labels returned by function tell_2. If
telling a constraint fails, it identifies the original constraints involved in the failure.

Proposition 1. The set of labels X returned by function tell_2 in case of failure
identifies a conflict set composed of the corresponding constraints.

Data dependencies do not guarantee that the identified conflict sets are minimal. As
shown above, the minimal conflict sets for Examples 3 and 4 are {1,3} and {2,3}. In
general, minimal conflict sets must be obtained by multiple interactions between the
Defeasible Constraint Solver and the Hierarchy Manager. In the example above, if
constraint 1 is defeated, the DCS would still return a conflict set {2,3}. Then the HM
would still need to ascertain that constraint 3 is satisfiable (by defeating constraint 2)
in order to guarantee that the conflict set {2,3} is minimal. A similar interaction
would identify the other minimal conflict set {1,3}.



4.3. Defeating a Constraint 

In contrast to the approach using witness variables, defeating a constraint L is not 
trivial. Not only the pairs originated by telling L must be removed from the store but 
also the consequences of this removal must be propagated to the other pairs as done 
by function defeat_2 below. 



function defeat_2(in L: label; LCSi: labelled_subst): labelled_subst;
   LCS := remove(LCSi, L), RS := dependent(LCS, L),
   Reset := RS,
   while RS <> {} do
      RS := RS \ {R},                      % R is an arbitrary element of RS
      LCS := remove(LCS, R), R1 := dependent(LCS, R),
      RS := RS \ {R} U R1, Reset := Reset U R1,
   end while
   for L in Reset do
      C := constraint(L)                   % original constraint with label L
      if tell_2(L-C, LCS, LCSx) then LCS := LCSx
   end for
   defeat_2 := LCS
end function



Function defeat_2 firstly removes from the constraint store all the pairs originated by 
constraint L (i.e. of the form L-_- v/V). Then it removes from the remaining labelled 
pairs those depending on L. These pairs are removed recursively (while loop). 
Eventually a set of labels Reset is obtained representing all the constraints that are 
subsequently reset (i.e. re-told). As the initial constraint store is consistent, the reset is 
successful (for loop) and defeat_2 returns the new constraint store. 

Example 4b. Defeat constraint 1-ab+bc+1 from LCS2 of Example 4.

Since LCS2 = {1-{2}-a/1, 1-{}-b/1, 2-{1}-c/0}, defeating constraint 1 begins by
removing the first two elements of LCS2, resulting in LCS = {2-{1}-c/0}. Its only
element, obtained from constraint 2, depends on constraint 1 and thus RS = {2}. The
first loop removes this element of LCS and makes Reset = {2}. Constraint 2 is then
retold (c/0 cannot be obtained from constraint ab+ac+1 alone) and once this is done
the labelled substitution {2-{}-a/1, 2-{}-b/c+1} is returned.
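
The propagation performed by the while loop of defeat_2 can be sketched as a worklist over the labels. In this sketch, which is an assumption-laden simplification, a labelled pair is a triple (L, Z, text) where the text of the pair is kept only for readability, and the final re-telling of the constraints in Reset is left out because it needs the solver.

```python
def reset_set(store, defeated):
    """Return (remaining_pairs, reset_labels) after defeating one label.

    store is a list of labelled pairs (L, Z, pair_text).
    """
    remaining = [p for p in store if p[0] != defeated]
    pending = {l for l, z, _ in remaining if defeated in z}
    reset = set(pending)
    while pending:
        r = pending.pop()                       # arbitrary element of RS
        remaining = [p for p in remaining if p[0] != r]
        dependent = {l for l, z, _ in remaining if r in z}
        pending |= dependent
        reset |= dependent
    return remaining, reset


# LCS2 of Example 4.
LCS2 = [(1, {2}, "a/1"), (1, set(), "b/1"), (2, {1}, "c/0")]
```

Calling reset_set(LCS2, 1) leaves an empty store and the reset set {2}, as in Example 4b.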

4.4 Discussion

The technique presented keeps constraint stores much smaller than using witness
variables, but does not provide minimal conflict sets. Although this is typical when
using data dependency techniques, there is room for improvements. In the example
above, the constraint store after telling constraint 1 is {1-{}-a/c+1, 1-{}-b/1}.

Applying this substitution to constraint 3-a+b+c+1 results in 3-{1}-
(c+1)+1+c+1 which simplifies to 3-{1}-1, unsolvable, yielding the minimal conflict
set {1,3}. Clearly, the value of c is irrelevant to the conflict. But as constraint 2
originated pair 2-{1}-c/0, and this is replaced in the unifier of a from constraint 1,
LCS2 contains the labelled pair 1-{2}-a/1. This introduces a data dependency of a on
constraint 2, which is spurious with respect to the conflict set {1,3}.

Such a spurious dependency could be avoided if c/0 were not composed with
a/c+1. Usually this composition is done in non-defeasible settings, as the main goal is
to detect failure efficiently. If the composition is not done at once it might have to be
done several times as more constraints are told, to check consistency of all the
constraints. However, in a defeasible setting, delaying the composition may pay off.
On the one hand, the reported conflict sets are smaller. On the other hand, if fewer
(spurious) data dependencies are considered, the algorithm to defeat a constraint
becomes more efficient, since fewer constraints are reset uselessly.

Another problem in detecting conflict sets is the sequence in which substitutions
are applied to the constraint. For example, if constraint 3-ab+1 is told to constraint
store {1-{}-a/1, 2-{}-b/0}, replacing variable a followed by b results in 3-{1,2}-1, which
returns conflict set {1,2,3}. But if variable b is applied first, 3-{2}-1 is obtained,
returning the (minimal) conflict set {2,3}. It thus pays off to simplify expressions after
applying a substitution, and to start with those with fewer dependencies. Such
improvements are analysed more thoroughly in the next section.
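
The effect of the order of application can be reproduced with a small sketch. It is restricted, as an assumption made here to keep it short, to stores whose pairs bind variables to the constants 0 or 1 and carry no further dependencies; the function name and the data layout are hypothetical.

```python
def apply_incrementally(constraint, store, order):
    """Apply the pairs one variable at a time, simplifying after each step.

    constraint: set of monomials (frozensets of variable names).
    store: variable -> (label, constant 0 or 1).
    Returns (simplified_constraint, labels_of_the_pairs_used).
    """
    labels = set()
    for v in order:
        if not any(v in m for m in constraint):
            continue                     # v vanished after simplification
        label, value = store[v]
        labels.add(label)
        new = set()
        for m in constraint:
            if v in m:
                if value == 0:
                    continue             # the whole monomial becomes 0
                m = m - {v}              # v = 1 drops out of the product
            new ^= {m}                   # XOR: equal monomials cancel
        constraint = new
    return constraint, labels


AB_PLUS_1 = {frozenset("ab"), frozenset()}       # constraint ab+1
STORE = {"a": (1, 1), "b": (2, 0)}               # {1-{}-a/1, 2-{}-b/0}
```

With the order a, b the labels {1,2} are collected, whereas with the order b, a the constraint already simplifies to 1 after the first step and only label 2 is collected.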

5. DCS through Improved Data Dependencies 

5.1 Checking Consistency and Causes for Inconsistency 

Spurious data dependencies may be avoided (to some extent) if new constraints are 
solved on fresh variables, i.e. those having no substitution in the constraint store. If 
one such variable exists, it should be selected (though carefully avoiding circular 
dependencies). This is the intended functionality of the pre_solve procedure below. 

If it is possible to convert a constraint into the form A*v+B, the procedure returns the
substitution for v and a new constraint not including this variable, nor any variable
previously defined in the constraint store LCS that depends on v. If needed, some
pairs in LCS are applied to the constraint to eliminate such variables, in which case
the labels of the constraints on which these pairs depend are returned as set W.

procedure pre_solve(in C: constraint; LCS: labelled_substitution,
                    out SO: Substitution, CO: constraint; W: set of Labels);
   if convert(C, LCS, A*v+B, W) then
      CO := (1+A)B, SO := {v/(1+A)v'+B}
   else CO := C, SO := {}, W := {}
   end if
end procedure

The second change, tell_2i, modifies tell_2 by applying the substitutions incrementally,
one variable at a time, and simplifying the result before replacing the next variable.
The whole algorithm is presented in function tell_3.

function tell_3(in L-C: labelled_const; LCSi: labelled_subst;
                out LCSo: labelled_subst, Z: setof labels): boolean;
   pre_solve(C, LCSi, SO, CO, W)
   LSO := add_label(L-W, SO), tell_3 := tell_2i(L-CO, LCSi, LCSx, Z)
   LCSo := LCSx U LSO,
end function



Example 5. Tell labelled constraints {1-a+e, 2-ab+c+1, 3-ae, 4-c+bd, 5-a+d,
6-ab+cd+1} to an empty labelled constraint store.

In this example the new approach is compared with the previous one, by means of
the evolution of the constraint store when constraints 1 to 4 are told.

When the fifth constraint is told no new variable exists, and tell_3 behaves similarly
to tell_2. The evolution of the constraint and its dependencies is shown below:

5-{}-     a+d
5-{4}-    a+(1+b)d'+c    by replacing d
5-{2,4}-  a+c            by replacing b
5-{2,4}-  a+ab+1         by replacing c
5-{2,4}-  a+a+1          by replacing b
5-{2,4}-  1              by simplification



Hence tell_3 detects that the set of constraints {1,2,3,4,5} is unsatisfiable and further
returns the conflict set {2,4,5}. This conflict set is a minimal conflict set that could
not be detected by tell_2. Even in its incremental version, based on the constraint
store presented in the left column, the only conflict set that could be detected was
{1,2,3,4,5}, regardless of starting to apply LCS4' on variable a or on variable d.

5.2 Defeating Constraints 

Defeating constraints in the new approach is similar to before, although the reset sets 
are usually smaller. This is illustrated in the following example. 

Example 5b. Defeat labelled constraint 2 from the constraint store LCS4 of Example
5, and then tell constraints {5-a+d, 6-ab+cd+1}.

Defeating constraint 2 imposes that constraint 4 is reset, before telling constraint
5 (in the previous approach, given the dependencies stored in LCS4', defeating the
second constraint would reset not only constraint 4 but also constraints 1 and 3, i.e. all
constraints would be reset!).

Defeat 2           LCS4a:   1-{}-a/e   3-{1}-e/0
Reset 4 - c+bd     LCS4b:   1-{}-a/e   3-{1}-e/0   4-{}-c/bd
Tell 5 - a+d       LCS5:    1-{}-a/e   3-{1}-e/0   4-{}-c/bd   5-{}-d/a


Constraint 6-ab+cd+1 has a single variable, b, not defined in LCS5. As c depends on
b, c is first replaced to avoid circular dependencies. Applying tell_2i results in

6-{4}-    ab+bd+1    by replacing c
6-{4,5}-  ab+ba+1    by replacing d
6-{4,5}-  1          by simplification

which fails and returns the (minimal) conflict set {4,5,6}.

5.3 Comparing the Data Dependency Approaches 

The previous examples informally provided some insight into the main features of the
two data dependency approaches, as well as into their differences. This section
addresses some formal characteristics of the approaches.

Proposition 2. Procedure tell_3 does not always detect minimal conflict sets (even in 
the best-case). 

This property is inherent to the way in which constraints are maintained in the 
structure store. Some constraints have no reference to them free from data 
dependencies, in which case telling the opposite constraint may not return the minimal 
conflict set. For example if constraint 7-ae+l (opposed to constraint 3) is told to 
constraint set LCS4 of example 5, the minimal conflict set {3,7} cannot be reported. 
Instead, conflict set {1,3,7} is reported, since the labelled pair on variable e, 
originated in constraint 3, carries with it the data dependency on constraint 1. 

Proposition 3. The constraint stores produced by tell_3 introduce no more data 
dependencies than those produced by tell_2. 

This is due to the way they have been developed. In fact, as can be checked on
Example 5, they maintain the same information (given the same set of told and
defeated constraints) but stored in a different way. One may regard tell_2 as composing
eagerly all the substitutions obtained in the process, whereas tell_3 composes these
substitutions lazily. Nevertheless it can be shown that, given the same sequence of told
and defeated constraints, constraint store LCS_2 produced by tell_2 is the fixpoint of
the composition of LCS_3, the constraint store produced by tell_3, with itself. The
following proposition thus follows.

Proposition 4. All conflict sets returned by tell_3 are included in those of tell_2. 

5.4 Redundant Constraints 

In general it is important to detect redundant constraints, i.e. those entailed by others 
already told, as they increase the size of the constraint store (cf. the solver over linear 
rational constraints [HoMB96, KoHo96]). Once a constraint is entailed by others it 
may be ignored, as long as those entailing it are not defeated. 

Similarly to conflict sets, an entailing set, with respect to a constraint C, could be
defined as a set of constraints which entail C. A minimal entailing set w.r.t. constraint
C is one whose strict subsets do not entail C.

Detecting redundancy, namely minimal entailing sets, is a difficult issue in
systems using reduction techniques, and is based on the fact that telling a redundant
constraint does not decrease the domains of its variables [MeBa93, CoDR96] and all
the constraints that decreased the domain of these variables are potentially in the
entailing set of the redundant constraint. In our boolean setting, which uses a solved
form implementation of the constraint store, all that is required is to check whether
telling the new constraint changes the constraint store.

Using function tell_2, a constraint L-C is redundant with respect to others if the result
of the apply_2 function is L-Z-0, in which case Z is an entailing set of C.

Example 6. Tell constraint 8-ab+bd+1 to the constraint stores of Example 5.

Applying (apply_2) LCS4' to the constraint returns 8-{1,2,3,4}-0, so the
constraint is entailed by the set of constraints {1,2,3,4}. Alternatively, applying
(apply_3) LCS4 to the constraint returns 8-{2,4}-0.

In this case it is easy to see that the minimal entailing set is {2,4}, and that this is 
indeed the entailing set returned by telling (tell_3), but not by tell_2. The following 
propositions are justified with similar arguments used in the previous section. 

Proposition 5. Procedure tell_3 does not always detect minimal entailing sets. 

Proposition 6. All entailing sets returned by tell_3 are included in those of tell_2.

6 Conclusion 

This paper discussed three alternative extensions to a classical constraint solver over
the booleans in order to make it defeasible, and embeddable in a general architecture
for defeasible constraint solving. This complements previous work on defeasible
solvers over the finite domains and linear constraints over the rational numbers. The
first approach uses witness variables and guarantees the detection of all minimal
conflict sets of constraints, but introduces important complexity. The other
approaches detect conflict sets by means of data dependencies that are incrementally
computed (either eagerly or lazily). Although fewer formal characteristics are derived,
data dependency techniques seem more promising when dealing with a moderate to
high number of constraints.

The most promising approach (performing a lazy composition of substitutions)
relies on selecting a variable from the constraint that is told, but may still be
improved. The choice may prove wrong in the light of newly told constraints and, in
some cases, it can be changed “on the fly”, without backtracking. Not neglecting
the study of formal characteristics, we intend to test such improvements in problems
of considerable size, to check the practical limitations of the methods.

Abstract. The design of the sequential control program for a manufacturing
system is a difficult task which is traditionally carried out by
engineers. In this paper we present MACHINE, a nonlinear planner with
an automata-based representation of operators, which is able to obtain
control sequences for manufacturing systems. These control sequences are
an abstract representation of a sequential control program which may be
easily translated into real programs expressed as GRAFCET charts or
Petri nets.



1 Introduction 

The design of sequential control programs for manufacturing systems is a dif- 
ficult task which is traditionally carried out by engineers. Artificial intelligence 
planning techniques have proved to be very useful in the building process of 
such programs [5,7,9,10,14], obtaining error-free programs and saving engineering
time, which makes it an area of increasing interest in the AI community.
However, there are some features of manufacturing systems that are not consid- 
ered or are not considered in enough detail. This work focuses on these features, 
perhaps the most important ones from a qualitative viewpoint, and in the build- 
ing of a planning system able to deal with them. The reason for doing this is 
to show that the reasoning process about the actions that take place in a man- 
ufacturing system is slightly different from the process followed in most known 
artificial intelligence planners and that the results obtained could be more real- 
istic if these features were considered. 

In the next section, these features and their motivation are presented. The 
next sections are devoted to explaining how these features affect the model of 
action and how a general nonlinear planning scheme, like the one presented in 
POP [15], may be adapted to deal with these features configuring a planning 
scheme called MACHINE. The last section explains how some other interesting 
features should be included in this planning scheme. 

This work has been supported by the CICYT under project TIC-0453. 

Helder Coelho (Ed.): IBERAMIA’98, LNAI 1484, pp. 409-420, 1998. 

2 Description of the Problem

A manufacturing system is the set of processes, machines and factories where
raw products are transformed into higher value manufactured products. A very
simple manufacturing system, which will be used to introduce the problem, is
shown in Figure 1.





[Figure: schematic of the system, with the labelled components Tank1 (water),
Heater, Mixer, Valve1, Pump and Valve2]




Fig. 1. An introductory manufacturing system 



These transformations are made by the machines of the manufacturing system,
called actuators. The operation of every actuator is defined by a finite state
automaton where the states of the automaton represent all the conditions in which
the actuator is intended to be, and every arc from one state to another represents
an action of the actuator. For example, the automaton which describes the
operation of Valve2 in Figure 1 would be the one shown in Figure 2.




Fig. 2. The automaton which describes the operation of Valve2
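
As an illustration, an actuator automaton can be written as a transition table. The sketch below is hypothetical: the two states "closed" and "open" are an assumption made here, and only the action names Open-Valve2 and Shut-Valve2 are taken from the control sequence discussed in this paper.

```python
# Transition table of a two-state valve: (state, action) -> next state.
VALVE2 = {
    ("closed", "Open-Valve2"): "open",
    ("open", "Shut-Valve2"): "closed",
}


def run(automaton, state, actions):
    """Follow the arcs of the automaton; reject actions with no arc."""
    for action in actions:
        if (state, action) not in automaton:
            raise ValueError(f"{action} is not possible in state {state}")
        state = automaton[(state, action)]
    return state
```

Opening and then shutting the valve brings it back to the "closed" state, while opening an already open valve is rejected.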



There are many representations for a control program for a manufacturing
system like, for example, GRAFCET charts, Ladder or Petri nets [6,11], but for
our purposes, the necessary level of detail to describe a control program is a
control sequence. A control sequence is an ordered sequence of actions with all
of the actions of actuators needed to transform raw products into manufactured
ones. The knowledge embedded in a control sequence is sufficient to reason about
the behavior of a control program; therefore, the aforementioned representations may
be considered as lower level representations or extensions of a control sequence¹.
For example, a possible control sequence to heat and carry the water from Tank1
to Tank2 could be like the one shown in Figure 3.

¹ In [2] we describe a methodology to translate these control sequences into
GRAFCET charts and Petri nets.

[Figure: control sequence from START to END over the actions TurnOn-Mixer,
TurnOn-Heater, TurnOff-Heater, TurnOff-Mixer, Open-Valve1, Open-Valve2,
TurnOn-Pump, TurnOff-Pump, Shut-Valve2 and Shut-Valve1]



Fig. 3. A small control sequence 



Apparently, a sequence of actions like this could have been generated by any
state-of-the-art planner. However, it has some interesting features that
make it difficult to obtain, mainly due to some inherent features of manufacturing
domains which must be taken into account. Let us see these features.

The first one is that an action of an actuator is somehow active until the next
change of state in the automaton of the actuator. For example, let us consider the
action Open-Valve1. It is executed in the sequence and the valve will be open,
but it will be active until the execution of Shut-Valve1. Therefore, it seems
reasonable to consider actions as intervals and that these intervals are defined
by two points in the plan: the action itself and the next action of the same
actuator which produces a change of state on it.

There are many approaches in the literature which consider actions as intervals,
like for example [1,3,12,13]. In some of them, the end of this interval is
defined by the achievement of all of its effects and in the others the end of the
interval has an implicit relation with the action itself (like for example a known
duration). However, none of these conceptions adequately fits this problem.
Let us consider again the action of opening Valve1. It achieves its effects at some
point before the starting of Pump and it continues active until the shutting of
Valve1, later in the sequence, that is, its interval of execution is [Open-Valve1,
Shut-Valve1]. This shows that the end of the interval for the opening of Valve1
is neither defined by the achievement of its effects (it is later) nor depends solely
on it (it depends on the next change of state of the actuator).

The second one, closely related to the first, is that if actions are to
be considered as intervals instead of points then, in addition to classical
preconditions (conditions which must hold before the action), it is necessary to
define some kind of simultaneous requirements: conditions which must hold
during the interval of the action. These requirements are a form of the during
relation in [1]. For example, let us consider the action TurnOn-Pump. It requires
the valves to be open before the pumping starts, but it is also necessary for them
to remain open until the end of pumping, that is, during the interval of execution
defined as [TurnOn-Pump, TurnOff-Pump]. This kind of requirement is also present
in the literature, for instance in [13] and [7], but the difference here is that
the interval which defines the protection for these simultaneous requirements
does not end with the achievement of the effects of the action, but rather with
the next action of the actuator that makes it change its state.

If actions are not considered as explained above, it is difficult to guarantee
that the valves are open during the interval of execution of the pump, that
is, in the interval [TurnOn-Pump, TurnOff-Pump], that the water is in
agitation while it is being heated, or that Valve1 is closed during
the agitation of the water.
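
A simultaneous requirement of this kind can be checked on a totally ordered sequence as sketched below. The sketch is an assumption-laden illustration, not the paper's algorithm: the helper names are hypothetical, intervals are expressed as positions in the sequence, and an action is assumed to provide a condition during its whole interval of execution.

```python
def actuator_of(action):
    """Assumed naming convention: 'Open-Valve1' acts on actuator 'Valve1'."""
    return action.split("-", 1)[1]


def position_intervals(sequence):
    """Return {action: (start, end)} as positions in the sequence; the end is
    the position of the next action of the same actuator, or len(sequence)
    for the dummy action END."""
    result = {}
    for i, action in enumerate(sequence):
        end = len(sequence)
        for j in range(i + 1, len(sequence)):
            if actuator_of(sequence[j]) == actuator_of(action):
                end = j
                break
        result[action] = (i, end)
    return result


def holds_during(provider, consumer, intervals):
    """True if the provider starts before the consumer and its interval
    lasts at least until the end of the consumer's interval."""
    p_start, p_end = intervals[provider]
    c_start, c_end = intervals[consumer]
    return p_start < c_start and p_end >= c_end


GOOD = ["TurnOn-Mixer", "TurnOn-Heater", "TurnOff-Heater", "TurnOff-Mixer",
        "Open-Valve1", "Open-Valve2", "TurnOn-Pump", "TurnOff-Pump",
        "Shut-Valve2", "Shut-Valve1"]
# Hypothetical faulty variant: Valve1 is shut while the pump is still running.
BAD = GOOD[:7] + ["Shut-Valve1", "TurnOff-Pump", "Shut-Valve2"]

print(holds_during("Open-Valve1", "TurnOn-Pump", position_intervals(GOOD)))  # True
print(holds_during("Open-Valve1", "TurnOn-Pump", position_intervals(BAD)))   # False
```

The same check confirms that the mixer is running during the whole heating interval in the first sequence.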

And finally, if one thinks about the example from a causal viewpoint, then a 
strictly correct sequence would have also been the one shown in Figure 4 because 
there is nothing that tells the actuators to be off once the water is in Tank2. 

START → TurnOn-Mixer → TurnOn-Heater → Open-Valve1 → Open-Valve2 → TurnOn-Pump → END

Fig. 4. An alternative control sequence.



However, the truth is that there are safe states in the automata which describe
the operation of actuators, and these states must be reached by every
actuator before the end of the sequence, so the correct sequence is actually the
one shown in Figure 3. One way to introduce this new feature into the process of
building sequences would simply be to include these safe states in the goal
of the problem. Although this achieves a safe state for every actuator, it seems
too global: it may be difficult to decide the point in the sequence at
which the actuator reaches a safe state, or even whether it would be necessary to use
the same actuator later in the sequence and return it to a safe state. It seems
that this decision is specific to each action that does not leave its actuator in a safe state.
Therefore, the need to leave the actuator in a safe state can be modelled as a
later requirement of the actions, that is, as a condition that must hold after the
action. In the example of the valve, the safe state is the one in which the valve
remains shut, so the need to shut it as soon as possible can be modelled as a
later requirement of the action which opens the valve.
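
The idea of a later requirement can be illustrated with a small check over a sequence. This is a sketch under stated assumptions rather than the paper's mechanism: the table LATER_REQUIREMENT, which pairs each state-changing verb with the verb that restores the safe state, and the Verb-Actuator naming convention are both hypothetical.

```python
# Assumed: an "Open" action must later be followed by "Shut" on the same
# actuator, and a "TurnOn" action by "TurnOff".
LATER_REQUIREMENT = {"Open": "Shut", "TurnOn": "TurnOff"}


def unsatisfied_later_requirements(sequence):
    """Return the actions whose later requirement is not satisfied by any
    subsequent action of the same actuator."""
    pending = []
    for i, action in enumerate(sequence):
        verb, actuator = action.split("-", 1)
        needed = LATER_REQUIREMENT.get(verb)
        if needed is None:
            continue  # this action already leaves the actuator in a safe state
        if f"{needed}-{actuator}" not in sequence[i + 1:]:
            pending.append(action)
    return pending


FIGURE_3 = ["TurnOn-Mixer", "TurnOn-Heater", "TurnOff-Heater", "TurnOff-Mixer",
            "Open-Valve1", "Open-Valve2", "TurnOn-Pump", "TurnOff-Pump",
            "Shut-Valve2", "Shut-Valve1"]
FIGURE_4 = ["TurnOn-Mixer", "TurnOn-Heater", "Open-Valve1", "Open-Valve2",
            "TurnOn-Pump"]

print(unsatisfied_later_requirements(FIGURE_3))  # []
print(unsatisfied_later_requirements(FIGURE_4))
# ['TurnOn-Mixer', 'TurnOn-Heater', 'Open-Valve1', 'Open-Valve2', 'TurnOn-Pump']
```

Attaching the requirement to each action, instead of to the goal, is what lets the check report exactly which actions leave an actuator outside its safe state.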

These are the basic features which must be taken into account in order to 
enable a planning system to reason about the actions that take place in a man- 
ufacturing system. Since they do not fit adequately either into known models 
of action which consider actions as a point in a sequence, or in others which 
consider actions as intervals, the following model for actions and plans has been 
defined. 



3 A Model for Actions and Plans 

Every actuator in a manufacturing system is represented as an agent whose
operation is described by a finite state automaton. Thus, every agent has a set
of states S, which describe all the possible conditions in which it is intended to
be, and a set of actions A, each of which describes a transformation as well as a
change of state in the agent. Additionally, an agent has a name N, which must
be unique, a set of variables V, which are used to represent the objects related to
the operation of the agent (for instance products, chemicals, interconnection
points between agents or constants), and a set of codesignation constraints C
defined on the set of variables, which define the set of valid values for every
variable.

Agent = (N, V, C, A)

Every action of every agent is defined by a unique name N, a set of effects,
which is represented by means of an addition list ADD and a deletion list DEL
of literals that represent the transformation made by the action, and a set of
requirements, divided into a list of previous requirements ANT, which must hold
before the action, a list of simultaneous requirements DUR, which must hold
during the action, and a list of later requirements POST, which must hold after
the action.

Action = (N, ADD, DEL, ANT, DUR, POST)



Example 1. This example roughly shows how Valve2, seen previously, could be
described by this model (using a Lisp-based notation).



(AGENT
  (N Valve2)
  (E OPEN SHUT)
  (V ?SOURCE ?IN ?OUT ?CHEM)
  (C (?SOURCE NIL)
     (?IN (PUMP))
     (?OUT (TANK2))
     (?CHEM NIL))
  (A
    (ACTION
      (N Open-Valve2)
      (ADD (STATE Valve2 OPEN)
           (OPEN-FLOW ?CHEM ?SOURCE ?OUT))
      (DEL (STATE Valve2 SHUT))
      (ANT (STATE Valve2 SHUT)
           (OPEN-FLOW ?CHEM ?SOURCE ?IN)
           (CONTAINS ?CHEM ?SOURCE))
      (DUR (OPEN-FLOW ?CHEM ?SOURCE ?IN))
      (POST (STATE Valve2 SHUT)))
    (ACTION
      (N Shut-Valve2)
      ...))
)
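
For readers more familiar with record-like data structures, the same description can be mirrored as follows. This is only a sketch: the class and field names are assumptions chosen to follow the tuple definitions above, literals are represented as plain tuples, the opening action is named Open-Valve2 here, and a constraint value of None is assumed to mean that the variable is unconstrained (NIL in the example).

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    name: str
    add: list = field(default_factory=list)     # addition list (ADD)
    delete: list = field(default_factory=list)  # deletion list (DEL)
    ant: list = field(default_factory=list)     # previous requirements (ANT)
    dur: list = field(default_factory=list)     # simultaneous requirements (DUR)
    post: list = field(default_factory=list)    # later requirements (POST)


@dataclass
class Agent:
    name: str
    states: list
    variables: list
    constraints: dict   # variable -> allowed values, or None if unconstrained
    actions: list = field(default_factory=list)


open_valve2 = Action(
    name="Open-Valve2",
    add=[("STATE", "Valve2", "OPEN"),
         ("OPEN-FLOW", "?CHEM", "?SOURCE", "?OUT")],
    delete=[("STATE", "Valve2", "SHUT")],
    ant=[("STATE", "Valve2", "SHUT"),
         ("OPEN-FLOW", "?CHEM", "?SOURCE", "?IN"),
         ("CONTAINS", "?CHEM", "?SOURCE")],
    dur=[("OPEN-FLOW", "?CHEM", "?SOURCE", "?IN")],
    post=[("STATE", "Valve2", "SHUT")],
)

valve2 = Agent(
    name="Valve2",
    states=["OPEN", "SHUT"],
    variables=["?SOURCE", "?IN", "?OUT", "?CHEM"],
    constraints={"?SOURCE": None, "?IN": ["PUMP"],
                 "?OUT": ["TANK2"], "?CHEM": None},
    actions=[open_valve2],
)

print(valve2.actions[0].post)  # [('STATE', 'Valve2', 'SHUT')]
```

Note that the later requirement of the opening action is exactly the safe state of the valve.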



The description of the problems that appear in a manufacturing system con- 
sists of a set of transformations which must be made on raw products in order to 
obtain the manufactured ones. Although most of the manufacturing processes 
are quite complex, in this paper only simple transformations are considered; 
however, they are expressive enough to show the main difficulties during the 
building process of a control program. Thus, a problem P = (D, I, G) is defined
by the following components.

Domain. A domain D is a knowledge-based model of the manufacturing system.
It is divided into a set of agents, which represents the set of actuators,
their operation and their interconnections described by this model of action,
and a set of axioms, which describe facts which are always true.

Initial. The initial state I is a conjunction of literals which describe the initial
state of both the manufacturing system and the raw products.

Goal. A goal G is a conjunction of literals which describe the transformation
needed to obtain manufactured products from raw ones.

The solution to this kind of problem consists of an ordered sequence of actions
of the agents of the domain which achieves the goal starting from the specified
initial state. This can be called a control program or an operation procedure [14],
but in this paper it will be called a plan. This is only a structural description of
what we consider a plan; the following points explain in detail the semantics behind
this conception of a plan.

— The interval of every action is defined by two points. If the action is the last 
one carried out by an agent, then the interval is defined by itself and the 
dummy action END, otherwise, the interval is defined by itself and the next 
action of the same agent. 

— Actions whose intervals overlap can be considered in parallel, that is, they 
are executed simultaneously. 

— Although the examples seen so far show a total order of actions, plans can 
have a partial order structure. A partial order is used not only to represent 
a class of total order plans, but also to represent possible parallelism. For 
example, let us consider the manufacturing system shown in Figure 5.




Fig. 5. A second manufacturing system 



A plan to carry ACID from TANK1 to TANK3 and WATER from TANK2 to TANK4 could
be the one shown in Figure 6. The fact that both branches of the sequence 
are unordered means that there is no commitment between them; therefore 
the intervals of actions in both branches could possibly overlap, that is, the actions
could possibly be executed in parallel.

— An immediate consequence of considering actions in parallel is that if the 
intervals of two actions can possibly overlap, then both actions should not 
interfere, that is, they must not have any opposite effect. 
Fig. 6. A partially ordered plan



— Causal links are defined as intervals of protection for the literals that appear 
in the requirements lists. Since there are three lists of requirements of differ- 
ent nature, the interval which defines a causal link can differ depending on 
the type of requirement. The causal link associated with a previous require- 
ment is defined from the producer of the literal until the consumer (in terms 
of [15]). When the requirement is a simultaneous one, then the causal link 
is defined from the producer until the end of the interval of the consumer. 
Later requirements have a different nature: they only need to be satisfied
and they do not need to be protected throughout the plan like previous or 
simultaneous ones. Therefore, causal links with respect to later requirements 
are not considered. 

— An action threatens a causal link if the literal associated with the causal link
appears in the deletion list of the action. Since causal links represent intervals
of protection for these literals, the interval of an action which threatens
a causal link and the interval of the causal link must never overlap.

— Neither interferences nor threats are allowed in a valid plan and they must 
be avoided by the usual methods of promotion and demotion. 

— The only notion of time in a plan like the ones in Figures 3 and 6 is the 
relative ordering between its actions, and this is only a qualitative notion like 
in [1] and [13]. The inclusion of a metric notion of time would be, of course, 
useful; however, the main problem that appears in the building process of
such a plan is the search for a correct interleaving of the actions and a 
qualitative notion of time is quite enough, although it is more conservative 
than a metric time, which would surely provide a more precise interleaving. 
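
The treatment of causal links and threats in the list above can be sketched for a totally ordered plan, where every action is identified by its position. This is an illustrative sketch only: the class CausalLink, the helper names and the literal (STATE Valve1 OPEN) used as the protected condition are assumptions, not part of the model as published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CausalLink:
    producer: int   # position of the action that adds the literal
    consumer: int   # position of the action that requires it
    literal: tuple
    kind: str       # "previous" or "simultaneous"


def link_interval(link, interval_end):
    """Protection interval of a causal link; interval_end[p] is the position
    at which the interval of the action at position p ends."""
    if link.kind == "previous":
        return (link.producer, link.consumer)
    if link.kind == "simultaneous":
        return (link.producer, interval_end[link.consumer])
    raise ValueError("later requirements are not protected by causal links")


def overlap(a, b):
    """True if intervals a and b share more than an end point."""
    return a[0] < b[1] and b[0] < a[1]


def threatens(position, deletions, link, interval_end):
    """An action threatens a link if it deletes the protected literal and
    its interval of execution overlaps the interval of the link."""
    action_interval = (position, interval_end[position])
    return link.literal in deletions and overlap(
        action_interval, link_interval(link, interval_end))


valve_open = ("STATE", "Valve1", "OPEN")   # assumed literal
# Assumed positions: Open-Valve1=4, TurnOn-Pump=6, TurnOff-Pump=7,
# Shut-Valve1=9, END=10.
link = CausalLink(producer=4, consumer=6, literal=valve_open, kind="simultaneous")
print(threatens(9, {valve_open}, link, {4: 9, 6: 7, 9: 10}))   # False

# Hypothetical faulty ordering: Shut-Valve1 at 7, TurnOff-Pump at 8.
print(threatens(7, {valve_open}, link, {4: 7, 6: 8, 7: 10}))   # True
```

The only difference between the two kinds of protected requirement is where the link interval ends: at the consumer itself, or at the end of the consumer's interval.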

Bearing in mind these conceptions of actions, problems and plans, the following
section describes a nonlinear planning scheme called MACHINE, designed
by adapting the general ideas presented in POP [15], which is able to obtain the
plans seen so far.



4 MACHINE: A Nonlinear Planning Scheme 

MACHINE is a generative refinement planning scheme [15] whose algorithm is
shown in Figure 7. The starting point is a null plan with two dummy actions, START
and END, which encode the planning problem as in SNLP [8] and UCPOP [15].
A refinement process is applied to this initial plan which, at every step, solves
a pending problem in the plan until there are no more pending problems or
the problem cannot be solved. The different pending problems which may be
found in a plan are pending subgoals (motivated by unsatisfied requirements),
threats, interferences, and order inconsistencies (motivated by a loop in the order
structure, which must be a strict order). They are all included in an Agenda,
which drives the refinement process. The search process to solve the tasks in the
Agenda is a basic depth-first engine over the set of choices to solve every task.



MACHINE(Domain, Agenda, Plan, Links)

1. When Agenda is EMPTY Return SUCCESS
2. Task ← SelectTask(Agenda)
3. Choices ← HowToDoIt?(Task, Domain, Plan)
4. Iterate over Choices until it is empty
   (a) How ← ExtractFirst(Choices)
   (b) DoIt(How, Domain, Agenda, Plan, Links)
   (c) When MACHINE(Domain, Agenda, Plan, Links) Return SUCCESS
5. Return FAIL



Fig. 7. The algorithm of MACHINE 
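
The control structure of Figure 7 can be sketched as a recursive depth-first search. The sketch below is not the COMMON LISP implementation: the three modules are passed in as hypothetical callbacks, Domain and Links are assumed to be captured inside those callbacks and the plan, and the agenda and plan are assumed to be immutable values so that backtracking needs no explicit undo step. The toy problem at the end is invented purely to exercise backtracking.

```python
def machine(agenda, plan, select_task, how_to_do_it, do_it):
    """Depth-first refinement: return a finished plan, or None on failure."""
    if not agenda:
        return plan                                  # step 1: SUCCESS
    task = select_task(agenda)                       # step 2
    for how in how_to_do_it(task, plan):             # steps 3 and 4
        new_agenda, new_plan = do_it(how, task, agenda, plan)
        result = machine(new_agenda, new_plan,
                         select_task, how_to_do_it, do_it)
        if result is not None:
            return result
    return None                                      # step 5: FAIL


# Toy problem (an assumption for illustration): give each task a distinct value.
OPTIONS = {"a": [1, 2], "b": [1]}


def select_task(agenda):
    return agenda[0]


def how_to_do_it(task, plan):
    used = {value for _, value in plan}
    return [v for v in OPTIONS[task] if v not in used]


def do_it(how, task, agenda, plan):
    remaining = tuple(t for t in agenda if t != task)
    return remaining, plan + ((task, how),)


print(machine(("a", "b"), (), select_task, how_to_do_it, do_it))
# (('a', 2), ('b', 1))
```

In the toy run the first choice for task a leaves no choice for task b, so the search backtracks and tries the second choice, which mirrors the way an empty list of choices leads to FAIL in step 5.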



MACHINE uses four data structures to store the information during the
planning process: an Agenda, the Plan and its Links, and the Domain under
consideration. The Domain is the knowledge-based model of the manufacturing system.
The Plan is a partially ordered set of nodes, where every node may be an
instantiated action from the Domain or a subgoal. Its Links are
the set of the existing causal links, which describes the causal structure of the
plan, that is, the plan rationale. The Agenda is a set of tasks, each of which
describes a pending problem in the plan.

The three basic modules of MACHINE (shown in boldface) are described as 
follows. 

SelectTask. This module selects the first task in Agenda in order to solve it.
Tasks in Agenda are ordered using the following scheme: first order inconsistencies,
then interferences, threats and subgoals. Order inconsistencies come first
because they have no solution and always lead to backtracking. Subgoals come
last because they are delayed until all the interferences and threats
are solved. In addition, subgoals are also ordered among themselves by their
relative ordering, in such a way that the subgoals closest to START are solved before
the furthest ones.
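
The ordering scheme of the Agenda can be expressed as a sort key. This is a sketch under assumptions: tasks are represented as hypothetical (kind, distance_from_start) pairs, and interferences are assumed to precede threats because that is the order in which the text lists them.

```python
# Assumed priorities, following the order given in the text.
PRIORITY = {"order-inconsistency": 0, "interference": 1, "threat": 2, "subgoal": 3}


def agenda_key(task):
    """Sort key: task kind first; subgoals are then ordered by how close
    they are to START."""
    kind, distance_from_start = task
    return (PRIORITY[kind], distance_from_start if kind == "subgoal" else 0)


def select_task(agenda):
    """Return the task that would be first in the ordered Agenda."""
    return min(agenda, key=agenda_key)


agenda = [("subgoal", 2), ("threat", 0), ("subgoal", 1), ("interference", 0)]
print(select_task(agenda))             # ('interference', 0)
print(sorted(agenda, key=agenda_key))
# [('interference', 0), ('threat', 0), ('subgoal', 1), ('subgoal', 2)]
```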

HowToDoIt?. This module builds a list with all the possible choices to
solve a selected task from the Agenda. An inconsistent order has no solution,
so the list will be empty. The choices to solve interferences and threats are the
known methods of promotion and demotion, chosen nondeterministically. Subgoals may
be solved by the axioms, by an existing action in the plan or by a new action from
the domain. Since this process is based on a most general unifier algorithm,
the codesignation constraints defined on the variables of the agents play an
important role by rejecting undesirable unifications.
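
The role of codesignation constraints during unification can be illustrated with a deliberately simplified sketch. It only matches a literal pattern against a ground literal, which is a special case of most general unification and not the algorithm used by MACHINE; the function names, the representation of literals as tuples and the literals in the usage example are assumptions.

```python
def is_variable(term):
    """Variables are written with a leading question mark, e.g. '?IN'."""
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, fact, constraints, bindings=None):
    """Match a literal pattern against a ground literal. Return the extended
    bindings, or None if the literals differ or a constraint is violated."""
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings or {})
    for p, f in zip(pattern, fact):
        if is_variable(p):
            allowed = constraints.get(p)   # None means unconstrained
            if allowed is not None and f not in allowed:
                return None                # rejected by a codesignation constraint
            if bindings.setdefault(p, f) != f:
                return None                # variable already bound to another value
        elif p != f:
            return None
    return bindings


constraints = {"?IN": {"PUMP"}, "?OUT": {"TANK2"}}
pattern = ("OPEN-FLOW", "?CHEM", "?SOURCE", "?IN")

print(unify(pattern, ("OPEN-FLOW", "WATER", "TANK1", "PUMP"), constraints))
# {'?CHEM': 'WATER', '?SOURCE': 'TANK1', '?IN': 'PUMP'}
print(unify(pattern, ("OPEN-FLOW", "WATER", "TANK1", "HEATER"), constraints))
# None
```

The second call fails only because of the constraint on ?IN, which is how such constraints prune choices that would connect an agent to the wrong point of the plant.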

DoIt. This module applies one of the choices in the list built by
HowToDoIt? to solve a problem. Pending subgoals related to simultaneous
and previous requirements are solved by producers which must come before the
consumer action, and later requirements are solved by actions which must come
after the consumer. The inclusion of a new action in Plan to solve a pending
subgoal implies the following tasks. First, all of its requirements are included
as pending subgoals in Agenda and the new causal link is included in Links
(if an existing action is reused, the causal link is also included).

Second, when a new action is included in Plan, the end of its interval of 
execution is unknown, so, by default, it is assumed that its end is the dummy 
action END. However, the true end of all of the actions in the plan is continuously 
searched as shown in the previous sections. Since the end of the interval of an 
action implies a change of state in the agent which carries out the action, every 
time a requirement of change of state is solved by an action then the end of the 
interval of this action has been found, and it is updated in Plan together with the
causal links in Links related to this interval.

And third, if new interferences or threats have appeared, they are also included in
Agenda as pending tasks. Interferences and threats are found when there is a
harmful overlap between the intervals of execution of actions and the intervals
defined by causal links, as explained in the previous section. Promotions and
demotions to solve interferences and threats are not applied between actions but
between their intervals of execution. When an action is promoted (or demoted)
over another action, it is promoted over its whole interval of execution, not only
over the action.

However, MACHINE can delay the solution of some threats and interferences
until a later moment in the resolution process. The reason is that, as mentioned
before, not all of the intervals of the actions in Plan are known: some of them
are known and others will become known as pending subgoals are solved.
Therefore, if a threat or an interference is related to an undefined interval, then
it should be delayed until the ends of the involved intervals are known. In order
to do that, these kinds of threats and interferences are placed in Agenda after
the pending subgoals, giving them lower priority. Later, if the solution of some of
these subgoals finds the end of some of these problematic intervals, Plan will be
updated and the threat or interference will be moved back to the beginning of Agenda
and, so, solved appropriately. This is also a least commitment heuristic which
could be summarized as follows: "do not try to solve a problem if you do not
know it exactly".

5 Experimental Results 

MACHINE has been implemented in COMMON LISP and has been tested using
the problems shown throughout this paper. It found the correct plan, i.e. control
sequence, for all of them, and its behavior is shown in Table 1. The final result
of MACHINE is a control sequence; it is not exactly a control program, but it
has the necessary level of detail to be considered as such. Furthermore, in [2] we
show in detail how these control sequences may be translated into GRAFCET
charts [6] and Petri nets [11], which are true representations of a control program and
very useful tools in the design and modelling of manufacturing systems.



Table 1. Some experimental results 



Plan       Generated Nodes   Explored Nodes   Time
Figure 3   56                39               11 s
Figure 6   69                51               16 s
Figure 8   217               144              374 s



This table also includes the results of the real-size manufacturing problem
shown in Figure 8, which gives an idea of the scalability of MACHINE in terms
of complexity.

The problem consists of adding an ingredient (which is initially contained
in ADDITIVE-1) to the milk initially contained in MILK-TANK and then
bottling the mixture. The domain built for this manufacturing system has 4
axioms, 24 agents (valves, pumps, mixers, heaters, conveyor belts and a bottler)
and 48 different actions. The final plan, which is not shown here due to space
limitations, involves 40 actions.

Fig. 8. A real world manufacturing system

6 Conclusions and Extensions

This work has been motivated by the need to apply artificial intelligence planning
techniques to the design of control sequences for manufacturing systems.
This need has shown that the domain of manufacturing systems has some basic
properties which must be taken into account in order to ensure a correct
reasoning about the actions which take place in such a domain. The planning
system presented in this paper, called MACHINE, deals with these features by
using a nonlinear planning scheme based on POP, and it is able to obtain control
sequences for manufacturing systems.

Although MACHINE has been tested successfully on several problems, it can
only be considered as a step forward in the resolution of the problems which
appear in manufacturing systems. The reason is that this is a very rich domain
with many problems of different natures which should be taken into account
by an autonomous problem solver. However, the core of that
problem solver is actually a planning system, and the solutions to all of these problems
may be built as folders or extensions over this planning core, configuring an
integrated system. Some of these important problems, which will be dealt with
in the near future, are the following:

— Perhaps the most important problem is the inclusion of a metric time to 
quantify the intervals of actions, although in a flexible manner because even 
in real problems these intervals are not perfectly known. Time map man- 
agers [3,12] seem to be very promising in this task. 

— Manufactured products are really complex and a classic conjunctive goal is 
not enough to deal with complex goals. It is necessary to define behavioural 
goals as an ordered set of transformations on raw products, which is known 
as a recipe. At present, MACHINE does work with goals whose literals have 
a partial order structure, but this extension will be dealt with in a forthcoming paper.

— In these domains, there are what could be called procedures: complex prob- 
lems which can be decomposed into an ordered sequence of smaller sub- 
problems. The planning system must know these procedures and it must 
also know how to work with them. This problem points directly to HTN 
techniques [4]. 
— The control programs seen in this paper are intended to work in an open-loop
manner, that is, with no feedback from the environment. Real control
programs have feedback from sensors in the environment, and the planning
system must be able to include the information supplied by these sensors in
the planning process. This seems the most challenging problem because it
implies both:

a) The ability of the planning system to adapt its behavior to the different 
ways in which sensors may appear. Case-based and analogical techniques 
seem very promising in this task because they also seem to be the tech- 
niques used by humans in the same role. 

b) And the ability to include some kind of conditional behavior in the plan 
because the information given by sensors is not always available at plan- 
ning time. 